Data is an oft-used word that carries multiple meanings. In everyday speech, it might refer to mobile phone bandwidth, a completed application form or a collection of files. Even experts have a variety of definitions of data, as well as of the related concepts of information and knowledge (Zins, 2015). In this study, we refer to data by its accepted definition as information or knowledge stored in a form suitable for computer processing. Wellisch expressed this as ‘the representation of concepts or other entities, fixed in or on a medium in a form suitable for communication, interpretation, or processing by human beings or by automated systems’ (Wellisch, 1996), which is a useful definition because it acknowledges both that humans and algorithms can use data, and that data is something that needs interpretation.
From a strict grammatical stance, ‘data’ is the plural of the singular ‘datum’, and thus it is more correct to write ‘the data are correct’ - but this usage is rapidly declining (‘Data’, no date) and throughout this thesis I use the more widely adopted convention of treating data as a singular mass noun, as in ‘the data is correct’.
The concepts of ‘data’ and ‘information’ are closely related, so much so that they are often used interchangeably. Ackoff presented a model for distinguishing data, information, knowledge, understanding/intelligence and wisdom, in which he describes data as physical symbols - effectively the 1s and 0s stored in a computer or the ink marks on a page - which become useful when humans or algorithms are able to deduce facts from those symbols to answer simple questions; at this point the data becomes ‘information’. Answering deeper ‘how’ and ‘why’ questions allows information to become knowledge and understanding, towards the ultimate goal of wisdom (Ackoff, 1989). This is often represented as the DIKW pyramid (DIKW being shorthand for the data-information-knowledge-wisdom transformation that occurs as you move up through the layers), the origin of which is unknown (Wallace, 2007). Figure 1 builds upon a representation by George Pór (Pór, 1997) of the pyramid as a ‘wisdom curve’, showing how increasing meaning and value can be obtained from data as deeper questions are asked of it. This theme of obtaining meaning and value from data is an important aspect of my research that I will refer back to.
This model, in which turning data into information can be thought of as using that data to answer questions, is consistent with the idea that “information can be thought of as the resolution of uncertainty” (‘Information’, no date). The exact origin of this definition is unknown but it is often attributed to mathematician Claude Shannon (Shannon, 1948). Indeed, from an etymological stance, one who is informed is one who has received knowledge or concepts as a result of what has been communicated to them. Thus we can consider data to be the material from which information can be received. It follows also that data contains uncertainty that must be resolved in order for it to become meaningful information.
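Shannon’s framing of information as the resolution of uncertainty can be made concrete with a small worked example. The following Python sketch (purely illustrative, and not drawn from any of the cited works) computes Shannon entropy - the average uncertainty, in bits, that is ‘resolved’ when the outcome of a random variable becomes known:

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: the average uncertainty resolved
    by learning the outcome of a random variable."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# A fair coin toss carries exactly one bit of uncertainty:
print(entropy([0.5, 0.5]))   # 1.0
# A heavily biased coin is more predictable, so learning its
# outcome resolves less uncertainty (~0.47 bits):
print(entropy([0.9, 0.1]))
```

On this view, data with no uncertainty to resolve (a fact already known) conveys no information, which is consistent with the distinction between data and information drawn above.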
The earliest computer systems used data to store mathematical and scientific facts. Data processing allowed previously manual operations to be performed with greater speed and accuracy, most famously in the work of Alan Turing and the Enigma code breakers during World War II (Hutton, 2012). This work heralded the advent of general-purpose computing - machines that could be applied to any problem, provided you could reduce that problem to data. Over the following decades, businesses began to apply computers to myriad new problem areas in all different fields of work and life, and in doing so began the encoding of information about people as data, be it for statistical purposes like censuses or research, or simply to serve customers more efficiently by storing databases of customer records.
The personal computer revolution (‘The personal computer revolution’, no date) of the late 1970s and 1980s put computers in every office and eventually every home too, and it soon became commonplace for each individual to have data stored about them in companies’ databases. In the subsequent years three factors have combined to accelerate this trend of storing data about people: i) labour costs have remained high and companies have sought ways to automate their businesses and to implement online services and call centres in place of in-person staff interaction, ii) computer processing and storage have become ever cheaper thanks to the advent of cloud computing, meaning that many business processes could be reduced to data processing tasks or entire businesses be moved online, and iii) the rise of smartphones and web-enabled devices has meant that the public are now ready and willing to conduct much of their daily business online through the web and apps. These factors have encouraged both commercial and civic providers to centralise their services and to ‘go digital’ to the greatest degree possible. In doing so they collect ever more data about people (now ‘service users’ or just ‘users’). Data is now seen as a resource which can be mined for value, and harnessed for profit and business efficiency - ‘the new oil’ (Toonders, 2014). Zuboff, in her 2019 book on ‘surveillance capitalism’, characterises this new digital world as the collection of human behaviour data so that it can be used as free raw material and converted into profit through hyper-personalised advertising and targeting by software platforms (Zuboff, 2019). This philosophy is also known as ‘data-ism’ (Brooks, 2013) and the analysis and exploitation of such data at scale is known as ‘big data’ (Neef, 2015).
As a result of data-ism, the collection of data about people has become an inevitable part of modern life. We live ‘digital lives’ (Various Authors, 2018) where we each interact directly and indirectly with hundreds of digital systems every day - as you shop, socialise, or browse online; as you listen to music or watch TV; as you interact with governments or healthcare services; as you travel, and many more. Every one of those interactions indicates the presence of data about you stored in a company database. Every aspect of our lives involves the input, processing and output of data – either provided by, collected from, or generated about, us. And the digital data we create and consume (whether consciously or not - data sharing is often unwitting (Crabtree and Tolmie, 2018)) has a direct influence on our lived experience - from decisions about what we are entitled to and what opportunities we will be offered, to the advertisements and content recommendations we are shown while we browse.
In 2017, the average American Internet user had 150 online user accounts with different providers (Caruthers, 2018). Data for the UK shows the number of service and supply relationships each individual has to manage increasing from around 45 in 1997 to around 250 in 2020 (Henderson and Group, 2020). As the amount of personal data relating to each of us has increased, the need for individuals to be able to manage it has grown. Unfortunately, the large-scale systems which collect data about us now function as ‘data traps’ (Abiteboul, André and Kaplan, 2015) - where data about us is easily gathered but very hard to remove or even to access. This creates a lack of agency for the individuals living in this data-centric world. The World Economic Forum’s “Rethinking Personal Data” project recognised the critical role that data (specifically personal data - data created by and about people) now holds, and identified that “an asymmetry of power exists today […] created by an imbalance in the amount of information about individuals held by industry and governments, and the lack of knowledge and ability of the same individuals to control the use of that information” (Hoffman, 2011, 2013, 2014b, 2014a).
Since as early as 1973, the need to protect individuals’ rights over their data has been recognised (US Department of Health Education and Welfare, 1973). The 37-nation organisation OECD in 1980 stated that “the right of individuals to access and challenge personal data is […] the most important privacy protection safeguard” and issued recommendations that individuals should be given basic privacy rights, including the right to be informed whether data is stored about them, and the right to an intelligible copy of that data (Organisation for Economic Co-operation and Development, 1980).
Over the subsequent decades, lawmakers began to enact laws to deliver these rights to individuals, notably the UK’s Data Protection Act 1984 (which set up an independent body, the Data Protection Registrar (now the Information Commissioner’s Office) with which organisations were required to register their usage of personal data), Ireland’s Data Protection Act 1988 (which introduced the concept of a ‘duty of care’ for data collectors - that they are expected to avoid causing damage or distress to data subjects), the EU’s Data Protection Directive in 1995 and the UK’s Data Protection Act in 1998. However, such laws were generally found to be ineffective - in 2002 Simon Davies, director of Privacy International said that the UK’s DPA was “almost useless in limiting the growth of surveillance” (Millar, 2002).
It was only in 2018, when the EU’s General Data Protection Regulation (GDPR) came into force, carrying with it significant designed-to-hurt fines for non-compliance (Kelly, 2020; Leprince-Ringuet, 2021), that individuals became able to exercise their data rights to any meaningful degree (‘The GDPR: Does it Benefit Consumers in Any Practical Way?’, 2020). The GDPR – which gives individuals key rights including rights to timely data access, explanation, erasure and correction (Information Commissioner’s Office, 2018) – can be seen as the first serious attempt to redress the aforementioned power imbalance over data between citizens and organisations, and is generally regarded as a landmark piece of legislation and a strong template for individual data protection. Around the world, companies have overhauled their privacy policies and updated their business practices to comply with the GDPR and other similar legislation, such as Japan’s 2017 Act on the Protection of Personal Information, India’s 2019 Personal Data Protection Bill and the 2020 California Consumer Privacy Act. The USA has yet to enact a national privacy law, but the GDPR’s influence is being felt in court rulings (Hoofnagle, Sloot and Borgesius, 2019).
Following the Snowden revelations (Gellman, 2013) in 2013, attention and concern over personal data use has grown year on year. In 2018, the Cambridge Analytica scandal (‘Facebook–Cambridge Analytica Data Scandal’, 2014; Chang, 2018) broke: the personal data of 87 million people, acquired from Facebook, was exploited with the apparent intent of influencing voting outcomes including the UK’s 2016 Brexit referendum and the USA’s 2016 election of Donald Trump. This, combined with widespread public information campaigns about the GDPR, has led to a heightened awareness of personal data rights (European Union Agency for Fundamental Rights, 2020), and at the time of writing in 2021, personal data protection laws and individual digital rights remain a rapidly evolving area.
From the GDPR and its antecedents, a number of key terms have been established which I will adopt in this thesis, specifically (Information Commissioner’s Office, 2014; The European Parliament and the Council of the European Union, 2016a):
The World Economic Forum called in 2011 for a balanced ecosystem around personal data, and identified transparency as a key principle needed to achieve this: People need to know what data is captured, how it is captured, how it will be used and analysed and who has access to it. Additionally people must understand the value created by the use of their data and the way in which they are compensated for this (Hoffman, 2011). It is almost impossible for people to assess that value, because they are unaware of most of their data (Spiekermann and Korunovska, 2017). Having awareness of your personal data is a critical first step, so that people might assess “to what extent the bargain is fair” (Larsson, 2018). In this regard, the GDPR can be seen as an important step in the right direction, as it requires data controllers to document their data practices and to provide data copies.
However, it is not sufficient simply to grant data subjects the technical or procedural capabilities to see the stored records about them. Access must be effective. Every individual must have the knowledge, skills and structures in place that enable them to achieve their objectives with their personal data (Gurstein, 2003). Gurstein later identified seven aspects that are necessary for access to be effective (Gurstein, 2011) and to avoid a ‘data divide’ of those who can harness their data and those who cannot:
Unfortunately people’s ability to derive value from their data, or to assess its value is limited; it is an asset over which we have little control. Our existing data ‘resides in isolated silos kept apart by technical incompatibilities, semantic fuzziness, organizational barriers [and] privacy regulations’. This lack of effective data access is detrimental to trust, innovation and growth (Abiteboul, André and Kaplan, 2015).
Beyond these operational concerns over effective access, there are practical limitations affecting people’s ability to make use of their data. Where people are given interfaces to their data, access is typically via a list or feed combined with a search box. Studies have shown that people prefer to find information by orienteering rather than search - associatively traversing related datapoints (Teevan et al., 2004; Karger and Jones, 2006). Having our documents distributed across multiple platforms, applications and devices makes interrogation and orienteering hard (Krishnan and Jones, 2005). Abowd and Mynatt highlight that in presenting information about people and their activities, everyday computing needs to address the facts that users’ activities rarely have a clear beginning or end, are often interrupted, and are often concurrent with other activities; that time is an important factor in finding and interpreting information; and that associative modelling of information is more useful than hierarchical models, because future usage goals cannot always be anticipated (Abowd and Mynatt, 2000). Recognising these needs, Krishnan and Jones identify that an effective information access system should support giving historical context, finding trends and patterns, time-based contextual retrieval, automatic structuring and multiple perspectives of the information (Krishnan and Jones, 2005). Shneiderman, in considering the effectiveness of interactive information visualisations, identified the need to support seven types of information interaction: overview, zoom, filter, details-on-demand, relate, history and extract (Shneiderman, 1996). While each of the capabilities mentioned in this paragraph exists in at least some data interfaces today, no general-purpose personal information access system with all, or even most, of those capabilities exists today.
The development and state of the art in the field of Personal Information Management Systems is explored in section 2.2 below.
In this section, I have described the establishment of the data-centric world in which we live today, the imbalance this creates between data subjects and data controllers, and what can be viewed as nascent attempts by governments to redress that imbalance through the creation of new laws. I have also outlined where research thinking has exceeded the practical data capabilities we have today, in identifying many factors and capabilities that should be considered when it comes to giving people a meaningful relationship with their personal data.
To date, people’s relationship with their personal data and the information within it has barely been explored. What mental models do people have around data? What value does it carry for them and what meaningful place does it (or should it) hold in their life? What is it that makes data meaningful and what do people want from their data? What is it like to live in this data-centric world, where your abilities over your data are limited by a lack of access to data and a lack of suitable interfaces and technologies to properly manage your digital life? This is one aspect of the research gap this thesis will address - discovering the human experience of data.
In the immediate aftermath of the second World War, Dr. Vannevar Bush wrote a landmark article for The Atlantic Monthly in which he envisioned a new scientific agenda for America and the world - to harness new general information-processing capabilities of computers to make the stored knowledge of mankind accessible and usable to all, for the betterment of society. He proposed the ‘Memex’, a device in which people would store their books, communications and records digitally so that it “might be consulted with exceeding speed and flexibility” - a personal filing system to serve as “an enlarged intimate supplement to his memory”. He emphasised the importance of allowing information to be stored in “associative chains of related materials” so that people would be able to retrieve information in the same way we think of it, traversing related items or ideas (Bush, 1945). During the next three decades, while computer systems were moving out of science labs and being established in workplaces as a means to automate and improve business processes, researchers began to look beyond usage in business and consider how computers might be used by ‘the common man’ to store one’s personal information in digital files (Nelson, 1965), for interpersonal communication (Shannon, 1948), to augment human intellect (Engelbart, 1962) and to model human thought (Simon and Newell, 1958).
Collectively, these constituted a recognition that computers could be considered a general-purpose tool that anyone could use for their own purposes, and in the 1970s and 1980s the home computer revolution (‘The personal computer revolution’, no date) seemed to place the potential power that “having reduced your affairs to software, software can take care of them for you” (Gelernter, 1994) into the hands of ordinary people.
Through the examination of people’s desk-based working practices, researchers began to understand how people handle information to inform the design of computer information systems. In 1983, Thomas Malone observed that categorisation is hard, and that any system must not only help the user to find information, but also remind the user of things to do. Computers could help through automatic classification, but should also allow both physical and logical “piles” of information to be arranged by the user (Malone, 1983). Personal Information Management (PIM) was first mentioned in 1988 by Mark Lansdale, who identified a need to design information management systems according to the psychology of the people who use them rather than by simulating office practices. By paying attention to how people categorise, recognise and recall information, and labelling information with appropriate attributes, information can be retrieved by different properties (Lansdale, 1988). PIM includes both directly interacting with digital files, webpages and e-mails as well as ‘meta-activities’ such as finding, arranging, searching, browsing, re-finding, categorising, sensemaking, keeping and discarding personal information. William Jones summarised PIM as “the art of getting things done in our lives through information” (W. Jones, 2011a).
Driven in part by the pursuit of better “time management” in the late 20th century (characterised by PDAs, palmtops and electronic organisers) (Etzel, 1995) and the focus on personal productivity in the early 2000s (characterised by ‘GTD’ (Getting Things Done) self-help books and to-do list software) (Andrews, 2005) and the continuing challenge of overcoming information overload in an increasingly digital world, PIM has been a thriving field both in research and in practice, with a peak in activity around the mid ’00s. Since the 1990s, numerous PIM system designs have emerged, each exhibiting some of the following six traits which I will now explain: Spatial, Semantic, Networked, Temporal, Contextual and Subjective.
Spatial PIM systems are based on the idea that people remember “where” they have put things and that this allows information to be quickly retrieved by associating it with a place (Negroponte and Bolt, 1978), much as people keep current information ‘in reach’ on a desk (Klein et al., 2004). Spatial approaches recognise that keeping is a valuable activity in its own right, one that informs sensemaking (Marshall and Jones, 2006). Placed information also performs an important reminding function (Barreau, 1995; Barreau and Nardi, 1995).
Building on Bush’s ideas of “associative chains of related materials”, networked PIM systems focus on the relationships between data. HyperText, as conceived in 1965 (Nelson, 1965), was designed to keep connections between information and allow the computer to understand how pieces of information are linked. The version of hypertext we use today is much weaker than Nelson’s HyperText or Berners-Lee’s Semantic Web and does not achieve these goals, as the inventors themselves agree (Ross, 2005; Nelson, 2006; Ziogas, 2020). In the absence of connected networks of personal information, and with people collecting more information than they discard (Whittaker and Hirschberg, 2001), the 2000s saw software like Google Desktop Search (‘Google Desktop Search’, 2004) and Infovark (‘Infovark Company Profile’, 2007) emerge to try to discover users’ data files and unify access to them, with limited impact (Bergman et al., 2008). Around this time, Microsoft developed WinFS, a system to re-invent the modern operating system around relational structured data rather than file storage, but it was never released (‘WinFS’, no date). Paul Dourish et al. proposed Placeless Documents, which relied on the idea of assigning user-specific properties to documents so that they could be arranged and recalled by their common properties rather than their location (Dourish et al., 2000; Dourish, 2003). Metadata – information about what the data is – is critical to information organisation (Foulonneau and Riley, 2008). One of the more advanced networked PIM systems is the Networked Semantic Desktop, which recognises that critical metadata is lost when files are copied or emailed, and attempts to maintain metadata and traceability by integrating PIM with peer-to-peer (P2P) technology (Decker and Frank, 2004).
Tags, which emerged as a means to organise data through systems like del.icio.us (‘Delicious’, 2003) and Flickr in the 2000s, are still widely used on social media and websites today, and are even available within macOS (Frost, 2019). Tags can be seen as a continuation of attempts to attach metadata to personal data to give it meaning, even though the dream of “folksonomies” has not been fully realised (Abbattista et al., 2007; Terdiman, 2008).
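The organising power of tags over folders can be shown with a minimal sketch (my own illustration, not code from any of the systems cited): a simple inverted index lets one item live under many labels at once, and be recalled by any combination of them, where a folder hierarchy forces a single location.

```python
from collections import defaultdict

# Inverted index: each tag maps to the set of items carrying it.
tag_index = defaultdict(set)

def tag(item, *labels):
    """Attach one or more tag labels to an item."""
    for label in labels:
        tag_index[label].add(item)

tag("holiday.jpg", "travel", "2019", "family")
tag("budget.xlsx", "finance", "2019")

# Recall by any combination of labels via set intersection:
print(tag_index["2019"] & tag_index["travel"])  # {'holiday.jpg'}
```

The same item surfaces under ‘travel’, ‘2019’ or ‘family’ without duplication, which is the essential affordance that filesystem folders lack.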
Semantic PIM systems, often known collectively as “The Semantic Desktop”, take the idea of metadata even deeper and focus on what the information means. The idea is to present an integrated view of a person’s stored knowledge by representing their documents, data and messages as URL-addressable semantic web resources (Sauermann, Bernardi and Dengel, 2005). The focus is on the retrieval of both documents and facts (Schumacher, Sintek and Sauermann, 2008). This implicitly means that the computer must know more about what the data it stores represents, elevating it from number cruncher to something that holds a collection of information about the world. Hendler and Berners-Lee see semantic web technologies as the building blocks for a new age of social machines (Hendler and Berners-Lee, 2010), machines that operate in society at an information level. This desire to give computers greater understanding of data has created emergent industries focused on using linguistics and statistics to perform content analysis, text mining and information extraction (Hotho, Nürnberger and Paaß, 2005). It has even been proposed that AI might help computers to understand users’ mental models (Nadeem and Sauermann, 2007).
While folders have emerged as the dominant means to organise computer files, and are effective because they allow you to arrange information according to its meaning to you (Bergman et al., 2012; Bergman, 2013), supporters of temporal PIM systems argue they are inadequate as an organising device. Freeman and Gelernter proposed Lifestreams, a PIM system based on the principles that storage should be transparent, archiving and compatibility should be automatic, and concise overviews of groups of related information should be available (Freeman and Gelernter, 1996). Central to this system is the idea that personal data can most easily be navigated when viewed as a timeline, partly because almost all data can be associated with a specific time, but also because this maps onto the idea of relating personal information to human memory (Lansdale and Edmonds, 1992). TimeSpace provides another model of a PIM system that organises personal information by both time and the user’s own activities, to support interaction with a “continuously changing and evolving information space” (Krishnan and Jones, 2005). Time-based PIM approaches also coincide with a drive to move beyond files as a system of information storage. Gelernter believed we should not have to put effort into organising files, and argued somewhat prophetically that commercial factors have skewed personal data systems design away from the realities of human lives (Steinberg, 1997). In my own 2011 article “Why files need to die”, I mapped out how a personalised timeline could allow better personal information organisation and retrieval (Bowyer, 2011). Echoing this, as well as Decker’s desire to maintain an information trail for every piece of information, Siân Lindley et al., having called for time to become a subject of design research in its own right (Odom et al., 2018), explored the concept of the file biography, which allows the history of information to be kept as the file is used and changed.
File biographies tell a story, and help to reconfigure our thinking away from mindsets around copying, deleting and sharing, to view information as fluid (Lindley et al., 2018). Moving into the world of online information collaboration, activity streams can also be seen as a recognition of the importance of tracking data as it changes, and offer new affordances (Hart-Davidson, Zachry and Spinuzzi, 2012).
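The temporal organising principle shared by these systems can be illustrated with a short sketch (hypothetical names and data; this is my own illustration of the idea, not code from Lifestreams or TimeSpace): every item carries a timestamp, the ‘stream’ is simply all items in time order, and filtered substreams take the place of folders.

```python
from datetime import date

# Hypothetical personal items, each stamped with a time.
items = [
    {"name": "tax-return.pdf", "when": date(2021, 1, 28), "kind": "finance"},
    {"name": "beach.jpg",      "when": date(2020, 8, 3),  "kind": "photo"},
    {"name": "invoice-04.pdf", "when": date(2021, 3, 12), "kind": "finance"},
]

def stream(items, predicate=lambda i: True):
    """A substream: a filtered, time-ordered view over all items."""
    return sorted((i for i in items if predicate(i)), key=lambda i: i["when"])

# The full stream is the timeline; a substream replaces a folder:
finance = stream(items, lambda i: i["kind"] == "finance")
print([i["name"] for i in finance])  # oldest first
```

Note that the substream is a view, not a location: an item can appear in any number of substreams without being moved or copied, which is what lets the timeline, rather than the folder, act as the organising backbone.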
In 1995, Barreau highlighted the importance of context to PIM: people need access to different information according to what they are doing (Barreau, 1995). In 2000, Abowd and Mynatt highlighted the importance of paying attention to the user’s context in order to offer access to the most relevant information and features, and they suggest context can be identified by considering the “5 W’s” - who, where, what, when and why (Abowd and Mynatt, 2000). Context-aware computing (Abowd et al., 1999; Eliasson, Cerratto Pargman and Ramberg, 2009) has subsequently emerged as a sub-discipline of research in its own right (Dey, 2001) (see also section 2.3.2). Dourish identified that context is both a problem of representation, in that it is information that can be captured and represented, and of interaction, in that it is a relational property between objects or activities. He calls for embodied interaction - allowing users to create their own practices and meanings in the course of their PIM system interaction - noting that context is not objective and predetermined; it arises from the activity (Dourish, 2004). You need different organisations of information in different contexts. This means that PIM systems need to support representing a given set of information in different ways (Lansdale and Edmonds, 1992) - but more than that, different information should be shown according to the current context; different perspectives are needed to segment your life. TimeSpace uses ‘activity workspaces’ to achieve this (Krishnan and Jones, 2005), but Karger et al.’s Haystack system refines the concept further, introducing the concept of lenses. Perspectives change which information records are included, whereas lenses allow you to focus on different attributes of what might be the same or different information (Karger et al., 2005).
Using a similar premise, Jilek’s “context spaces” system attempted a dynamically reorganising contextual sidebar, but was limited in flexibility because it used rigid types for specific contexts (Jilek et al., 2018). Lindley observes that different information abstractions are needed for different audiences, from which we can infer that in a multi-user system no single arrangement of information will suffice, because in the same context two people may have different needs (Lindley et al., 2018).
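The distinction between perspectives and lenses can be made concrete with a small sketch (hypothetical data and function names; an illustration of the concept as described above, not Haystack’s actual implementation): a perspective selects which records are in view, while a lens selects which attributes of those records are surfaced.

```python
# Hypothetical personal records with several attributes each.
records = [
    {"title": "Dentist", "when": "2021-05-02", "who": "Dr Patel", "cost": 60},
    {"title": "Lunch",   "when": "2021-05-02", "who": "Sam",      "cost": 12},
    {"title": "Flight",  "when": "2021-07-19", "who": None,       "cost": 240},
]

def perspective(records, predicate):
    """A perspective: choose WHICH records appear (e.g. 'May 2021')."""
    return [r for r in records if predicate(r)]

def lens(records, attributes):
    """A lens: choose WHICH ATTRIBUTES of those records are shown."""
    return [{a: r.get(a) for a in attributes} for r in records]

# A 'spending' lens applied over a 'May 2021' perspective:
may = perspective(records, lambda r: r["when"].startswith("2021-05"))
print(lens(may, ["title", "cost"]))
```

The two compose independently: the same ‘May 2021’ perspective could equally be viewed through a ‘people’ lens, and the ‘spending’ lens through a different perspective, which is what makes the combination more flexible than fixed activity workspaces.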
This is why the sixth trait of PIM systems is important: subjectivity. Information organisation cannot be handled in a deterministic, objective manner; any PIM system must be tailored to, and adaptable by, the user. Shipman and Marshall found that forcing users into explicit information models or workflows is harmful to user experience, and that interactive systems have to address the challenge of being just explicit enough while still allowing for differences in individual mental models (Shipman and Marshall, 1999). Bergman et al. (Bergman, Beyth-Marom and Nachmias, 2003) proposed three principles for subjective PIM, and their 2003 assertion that these principles are not well implemented in PIM systems remains true today:
Teevan’s take on PIM subjectivity is important: “The user should feel in control of the information”. She argues that this can be done by “understanding what conceptual anchors the user creates and keeping them constant while the data changes” (Teevan, 2001). As with semantic PIM systems, we can see that a successful system (or at least, its designers) must understand a great deal about its users.
In the late ‘00s, researchers and enthusiasts took PIM beyond task management and turned PIM thinking toward the self. In pursuit of Bush’s vision of augmenting human memory, Jim Gemmell and Gordon Bell in their MyLifeBits project at Microsoft (Gemmell, Bell and Lueder, 2006; Bell and Gemmell, 2009) tried to capture an entire life electronically. This became known as lifelogging: gathering as much data as possible, so that the maximum possible context, detail and understanding can be gained about that individual. In 2007, tech writers Kevin Kelly and Gary Wolf set out a vision for what they called the Quantified Self: achieving increased self-knowledge through self-tracking, not just of physical metrics such as step counts, heart rates or calories burned, but of almost any aspect of your own life that could be numerically recorded in a computer (Kelly and Wolf, 2007). The Quantified Self movement (QSM) is now a world-wide community of enthusiasts who have developed hundreds of tools and techniques for self-tracking/lifelogging and monitoring themselves through data for the purposes of self-improvement, and also has a non-profit organisation aiming to ‘advance discovery through increasing access to data’ (‘About The Quantified Self’, no date). Around 2009, researcher Ian Li began writing about what he called personal informatics, noting that it can be difficult to know ourselves due to incomplete self-knowledge, difficulties in monitoring our own behaviours, and being too busy to introspect. He proposes that “Computers can help: They can store large amounts of data, analyse the data for patterns, visualise the data, and provide feedback at opportune times” (Li, 2009). Just as QSM has gained traction with enthusiasts in the general public, so personal informatics has grown as an area of research, development and study in academic circles.
While QSM and lifelogging focus slightly more on capturing data about oneself and personal informatics focuses slightly more on the mechanisms of integrating and reviewing self-tracking data, there is so much overlap that all three can be considered the same field, which for convenience I will refer to by the shorthand self informatics (SI) throughout this thesis. SI can be seen as a distinct advancement from PIM because of its focus on using personal information for personal benefit. SI can be seen as the antithesis of corporate data-centric motives outlined in 2.1 - as here, data is gathered for the data subject’s benefit rather than that of the data-gathering organisation.
Li, Dey and Forlizzi conducted participatory research with SI practitioners and identified five stages of personal informatics systems (which can be seen as a refinement of William Jones’ list (W. Jones, 2011b) of the six activities involved in PIM). The five stages, illustrated in Figure 2, each of which can be driven by the user, the SI system or both, are:
Of these, reflection is perhaps the most important, as the capacity to gain new insight is the motivating reason to engage in SI. Reflective learning (Boud, Keogh and Walker, 1985) has been recognised as a valuable means of knowledge acquisition and improvement in a variety of contexts including education (Dewey, 1938), business (Beck et al., 2001), and research (Lewin, 1946). In the context of the wisdom curve (see Figure 1 above), reflection can be seen as asking questions of data in order to acquire knowledge about oneself. Knowledge about oneself (a.k.a. self-insight (Hixon and Swann, 1993)) serves not only to satisfy curiosity (Li, Dey and Forlizzi, 2010) but can improve self-control (O’Donoghue and Rabin, 2001), increase self-awareness (Aslam et al., 2016) and enable positive behaviours such as saving energy (Seligman and Darley, 1976).
Reflection can be facilitated in SI systems by enabling the tracking of subjective factors such as mood, health or activity, and can be triggered by means of notifications, or during more direct information exploration by the user as they recall or revisit experiences (Rivera-Pelayo et al., 2012). To aid SI users’ interpretation of data, contextualisation can be applied: enhancing information with additional facts to ease its comprehension. This can include social, spatial or historical context, subjective or objective metadata, external sources of information (e.g. weather) (Rivera-Pelayo et al., 2012), or external devices (Dey, 2000). There are two phases of reflection: discovery and maintenance. During the initial discovery phase, typical questions that SI users ask concern the history of data changes, understanding the context of a datapoint, the factors that cause a pattern in data, and the identification of suitable goals to pursue. During the maintenance phase, these goals frame the questions asked, which concern status (how well you are doing at meeting your goals) and discrepancies (examining the difference between actual behaviour and desired behaviour).
In order for a SI user to successfully reach this maintenance phase, where they can continue to reflect upon their actions and adjust their goals, they must have been able to successfully navigate each of the five stages illustrated in Figure 2; if they have not collected the right data, they cannot integrate it; if they have not been able to integrate the collected data in a meaningful way, they cannot reflect upon it; and so on. Li et al. termed this the barriers cascade (Li, Dey and Forlizzi, 2010), and the pursuit of new ways to overcome these barriers has in effect been the major problem space for all SI approaches; this is especially evident in the QSM (Choe et al., 2014). While effortless SI is not yet a reality and many barriers still exist, progress in easing the SI journey through the barriers cascade is being made: in 2011, Jones had noted that people often postpone or don’t have time for meta-level information management activities (W. Jones, 2011a), but by 2019 the increased automation around self-tracking and data collection was judged to have given people more free time and energy for reflection and managing their goals (Feng and Agosto, 2019).
As described in 2.1.2 above, the rise of data-centrism has meant that every aspect of our lives now involves digital service providers and products which process personal data. Smartphones put computers in everyone’s pockets, and cheap cloud computing and an open web allowed every organisation to serve the population digitally through apps and websites. In 2010, broadband access was declared a legal right in Finland (‘Finland: Broadband Access Made Legal Right In Landmark Law’, 2010), and in 2011, the UK Supreme Court declared that Internet access was an “essential part of everyday living” and the blanket denial of Internet access to criminals such as sex offenders was ruled unlawful (Roche, 2011; Wagner, 2012). Everyone now required access to information and online digital services. “The boundary between real life and online [had] disappeared” (Burkeman, 2011). The promise that whatever you want to do “there’s an app for that” had become true (Apple, 2009). During the late ’00s and throughout the 2010s, data-centric companies disrupted almost every industry: Amazon (shopping & books), Uber (taxis), Netflix (movie rental), Spotify (music), Airbnb (accommodation), Google (email, news & advertising), Facebook (social networking & advertising), PayPal/Revolut/Monzo (banking), Match/Tinder (dating), Steam (computer games), Just Eat (takeaways), and many more (Levine, 2011; Carter, 2015). As a result, we now produce rich data trails simply by going about our daily lives, and this has become “the driving force for value creation” online (Symons et al., 2017). More recently, as we start the 2020s, the trend has accelerated, with the COVID-19 pandemic necessitating the move of both information work and social activities online, using platforms such as Zoom, Google Docs and Miro (O’Donnell, 2020).
Throughout the transition to this information economy, the computing industry has delivered revolutionary new capabilities, but with every provider offering their own apps and websites, the information landscape has become hugely challenging for people to manage; information overload is now a serious problem that has been linked to increased anxiety, impaired critical thinking, exhaustion, and loss of willpower and focus (Hemp, 2009; Tunikova, 2018; Fu et al., 2020). Our personal information is fragmented and a unified interface is needed: “We must launch multiple applications and perform numerous repetitive searches for relevant information, to say nothing of deciding which applications to look in (Karger and Jones, 2006).” In the siloed world of today’s Internet, this has only got worse. Bergman’s subjective principles (see above) imply that our data should be able to move and be referenced freely, but it cannot. Our ability to share and connect data is limited (Crabtree and Tolmie, 2018). Our data is trapped (Abiteboul, André and Kaplan, 2015), not only because it is held by organisations without giving us effective access, but also by various practical means such as format incompatibilities, device restrictions, paywalls, and a lack of data portability. We need to free our data, as I have expanded upon elsewhere (Bowyer, 2018).
It is clear that general-purpose computing has yet to provide people with the tools to manage their complex digital lives. There have been attempts to create general-purpose interfaces for personal data, typically based around a timeline, such as AllOfMe.com (‘AllofMe Company Profile’, 2007; ‘AllofMe.com Teaser Clip’, 2008) in 2008 and myTimeline a decade later (‘myTimeline’, 2018); however, none of these products has reached public availability. To date, the closest market-successful tool that people have for general-purpose information handling is Facebook, given that it can store personal information, handle asynchronous and instant messaging, news, photo sharing, some retail functionality, brand interaction & support, calendaring and event management, and group discussions. However, it is a closed system with no capability for customisation; none of its content is available outside the network, and external content cannot be linked or interacted with except by import; as such it cannot be considered a PIM system. Its own Timeline feature, promoted at launch in 2011 as “the story of your life” and “a new way to express who you are” (Siegler, 2011), has been retired, along with many other tools designed to make information easier to manage, such as personal news feeds and friend lists (Perez, 2018) - a reminder that Facebook exists primarily to serve its advertisers, rather than the general public, as per the often-repeated saying “if you’re not paying for it, you are the product”. The most promising area for the development of interfaces for managing digital lives is the emerging “personal data locker” space, explored more in 2.3.4 below, which offers the promise of “a place for personal data”, as Jones imagined PIM should be (W. Jones, 2011a), though at the time of writing these are still quite limited.
As Abiteboul noted in 2015, “everyone should be able to manage their personal data with a personal information management system” (Abiteboul, André and Kaplan, 2015), but as yet, in any meaningful or holistic way, they cannot, because no general-purpose personal information management system for modern-day digital lives exists.
In this section, I have detailed the ways in which personal information management systems have developed, and shown that they have not kept pace with the ever-more-complex needs of the Information Age. Most PIM systems treat data as a static resource to be filed and accessed much as you would a paper file in a 1970s office. Most digital services operate in isolation from each other, without any shared representation or co-operative understanding of an individual’s personal information. Where personal data access is provided, it is limited in usage to the delivery of the specific service on offer; it is treated as a property asset, and the data is not participatory. As Katie Shilton writes, “Much of the social impact of participatory personal data will depend on how data are captured and organized; who has access; whether individuals consent and participate; and how (or whether) data are curated and preserved (Shilton, 2011).” We need “fundamental changes in the way we represent and manipulate data” (Karger and Jones, 2006); we need holistic representations of data that can be subjectively meaningful and which allow for the constant change and evolution of data over time.
Of particular importance is that we recognise that people exist in an interconnected world of relationships - with other individuals, and with organisations, and that the role of data within those relationships needs to be examined. When your data is held by others, managing personal information is not just a matter of arranging your own bookshelves, but rather a multi-party negotiation over representation, ownership, access and consent. Data is a shared resource with multiple users, and only a few researchers have begun to look at people’s interactions with data in this context (for example, activity streams (Hart-Davidson, Zachry and Spinuzzi, 2012), social sensemaking (Puussaar, Clear and Wright, 2017), and decentralised file storage (Zichichi, Ferretti and D’Angelo, 2020)). There has been negligible research into the role data plays within human relationships.
This is the second research gap that my thesis aims to address - to look at personal data holistically in the context of your life. How does the holding of personal data by third parties affect people’s ability to function in modern life? Do people have meaningful control over their personal data in this multi-party landscape? What practical problems do data-holding organisations’ current practices cause for people? What role should data take in our complex digital lives?
Up until the 1980s, the only reasons to consider the relationship between a human and the computer they were using were ergonomics, comfort and efficiency. People were shielded from the complexities of the machines they were using - the machine did the work and the human was just the operator. In the 1990s, the “first wave” of what is now known as Human-Computer Interaction (HCI) recognised humans as actors operating in groups, who had tasks to perform either using or assisted by technology (Bannon, 1995). People were now users of technology. Design thinking shifted from machine-centric to user-centred design (UCD), motivated by the goal of helping the user to do their tasks better. In the personal computer revolution of the 1990s, people began to work in complex and varied multi-user situations, and observation and understanding of a user’s working environment provided empathy that enabled better design. There was a recognition that people use computers differently in different contexts. In the 2000s, as smartphones, broadband and Web 2.0 brought computing into every aspect of our lives, HCI’s third wave looked beyond the workplace to consider users as unique humans with emotions and culture; design became about experiences (Bødker, 2006) which could span work, mobile and home domains. Computers were no longer just for work. This created a “chaos of multiplicity for HCI in terms of use technologies, use situations, methods and concepts” (Bødker, 2015); designers would now need to “embrace people’s whole lives” (Bødker, 2006). The blueprint for how this could be achieved was to be found in Mark Weiser’s seminal 1991 Scientific American article “The Computer for the 21st Century”, in which he envisioned a world where data could be accessed across many different devices, such that interfaces and interactions could be designed around the user’s data needs in specific contexts.
He recognised the need to put humans, not machines, at the centre of data interaction, and that in order to achieve “calm computing”, technology would need to “disappear into the background” of our lives (Weiser, 1991; Weiser and Brown, 1996).
Weiser’s vision was significant because it recognised the need for data to transcend the confines of a single machine; to satisfy human needs in different contexts, data needs to be pervasive (Saha and Mukherjee, 2003; Krishnan, 2010). From a technical perspective, Weiser’s vision has largely been realised, with today’s smartphones, tablets and digital whiteboards / smart TVs corresponding directly to his imagined “tabs”, “pads” and “boards” respectively. Ubiquitous computing now allows environments, vehicles and wearable computing to collect data via sensors – the “Internet of Things” (IoT), which enables context-aware computing (Abowd et al., 1999; Eliasson, Cerratto Pargman and Ramberg, 2009). But what of the interaction perspective? As an answer to this question, the concept of Human-Data Interaction (HDI) emerged. This sub-discipline of HCI outlines the vision that the human needs to have a direct, explicit relationship with their own data (Mortier et al., 2013, 2014), and that personal data should be considered an entity in its own right; people do not just need to interact with systems, but with the data itself. This can be seen as an echo of previous calls throughout the decades for a new relationship with our stored knowledge (Bush, 1945; Lansdale, 1988; Rogers, 2006; Hendler and Berners-Lee, 2010; W. Jones, 2011a).
Mortier et al. laid out three tenets of HDI: individuals need to have agency over how their data is used within the system, the data needs to be legible (i.e. understandable) to us, and we need negotiability - the ability to flexibly adapt and make use of the data. HDI has remained a small but important research niche within HCI, and many researchers continue to explore this field today (‘Human Data Interaction Project at the Data to AI Lab, MIT’, 2015; ‘HDI Network Plus, University of Glasgow’, 2018; ‘HDI Lab, Heerlen’, 2020; BBC R&D, 2017), as does this thesis. In order to understand what HDI might mean in practice we can look to Gregory Abowd’s 2012 paper which aims to update Weiser’s vision. In it, Abowd emphasises the importance of programming for environments, building a complete experience for the individual that considers not just the 2D screen they are using, but the entire surroundings and context of their environment. He imagines a hybrid, conjoined experience between people, devices, sensors and the cloud, where data storage and processing need not be constrained to the input and output devices we use (Abowd, 2012) and, crucially, where the individual within this “everyday computing” experience is harnessing technology for their own ends, not just being aided to complete a predetermined task (Abowd and Mynatt, 2000) - in essence, they are able to program their own environment.
Abowd’s vision is a helpful reference point to remind us how far from true human-data interaction we are today. As described above, data is trapped, and very few computing interactions today are designed as a situated experience. Some TV streaming services provide a good example of an interaction designed with context in mind: instead of typing in long email addresses and passwords, which is difficult on a TV remote, you can visit a short link from a smartphone or PC where you are already authenticated. But even though there are pockets of research around contextual experiences (for example the work around second screening (T. Jones, 2011; Zúñiga, Garcia-Perdomo and McGregor, 2015)), in general most design work today still focuses on a single interaction surface. In order for technology to disappear into the background so that we might live in a calm, engaged manner, as outlined by Weiser and expanded upon by Yvonne Rogers (Rogers, 2006), a more humane interface is needed (Raskin, 2000), one which designs for the whole person. Judging the success of a user interaction can no longer be done by assessing task-completion efficiency (Abowd and Mynatt, 2000) but should consider the holistic needs of the individual at that moment in time.
Yet in the 2010s, there was a growing recognition that the world had lurched severely away from such goals. The design of information-consumption interfaces was having a detrimental effect upon people, not just in terms of the psychological impacts of information overload as detailed above in section 2.2.4, but also in terms of the impact on users’ attention. This would become known as “the attention economy” (Simon, 1971; Croll, 2009; Cogran and Kinsley, 2012; Brynjolfsson and Oh, 2012). Social media technologies like infinite scrolling and smartphone notifications had created “a culture of perpetual distraction” (Timely, 2020) which “hijacks people’s minds” (Harris, 2016). As Zeynep Tufekci put it in her TED talk, “we are creating a dystopia just to make people click on ads” (Tufekci, 2017). In 2013, Tristan Harris released a presentation calling on the tech industry to respect users’ attention and minimize distraction (Harris, 2013a), which led to the creation of the Center for Humane Technology (Harris, 2013b), a central group in this new movement to design for positive human values and to practise value-sensitive design (Friedman and Hendry, 2019). This focus beyond just supporting data interaction to understanding and enhancing the individual’s lived experience can be seen as a central guiding tenet of human-centred design.
We can see from the above that the design of human-centred personal data interaction is not purely a matter of designing better user interfaces, nor even of designing for the user’s physical environment, but in fact a design challenge that exists at the sociotechnical (Bunge, 1999; Murton, 2011) level – it must take into account the social relationships of the individual (as detailed in 2.2.6) as well as the power imbalance that exists between data holders and data subjects (as detailed in 2.1.2). Andy Crabtree recognised the sociotechnical nature of the HDI challenge in his 2016 paper with Mortier on ‘The Shifting Locus of Agency and Control’ and highlighted particular aspects of this multi-party challenge around personal data, specifically being able to ensure the privacy of your data as well as the accountability data subjects require over data-processing algorithms and data-handling organisations (Crabtree and Mortier, 2016). These goals are now actively pursued through research into privacy by design (Cavoukian, 2010) and Critical Algorithm Studies (Gillespie and Seaver, 2016) respectively. In his subsequent work with Peter Tolmie, Crabtree focused on the particular HDI challenges around data-sharing, which must also be designed for (echoing Lindley’s work on file biographies mentioned earlier) (Crabtree and Tolmie, 2018). These areas of pursuing a human-centric agenda within a sociotechnical context continue to be areas of active research today, as seen in projects such as Nesta’s DECODE (Symons et al., 2017), which focuses on individual empowerment, and UKRI’s not-equal.tech (Crivellaro et al., 2019), which focuses on data justice (Taylor, 2017).
During the 2010s, while many were focused on the utility of PIM systems (as described in 2.2.2 above, and hereafter referred to as “traditional PIM”), some researchers, thought leaders and strategists were developing ideas that can be seen as the first sociotechnical designs for personal data interaction. One of the earliest was Doc Searls, who launched a project called ProjectVRM with colleagues at Harvard University around 2008. He envisioned a model he called Vendor Relationship Management (VRM), which can be seen as the inverse of Customer Relationship Management (CRM), where organisations use data to profile and learn more about their customers and get their attention (Searls, 2008). In essence, the vision (expanded in his 2012 book (Searls, 2012)) was to combat the attention economy by turning the world of commerce inside-out; individuals would publish tightly controlled personal data about themselves and their needs, and retailers could respond to these individuals with product offers, from which they would then select.
Taking a more technical slant on similar ideas, David Siegel outlined a vision of a personal data interface that would allow the ideas of VRM to be realised. He called this a Personal Data Locker, though the equivalent terms Personal Data Store, Personal Data Vault (PDV) and Personal Data Services are also used. The concept is explained in his book (Siegel, 2010) and video (Siegel, 2009). He also coined the term Pull-centric Computing (where information is ‘pulled’ at your request rather than being pushed upon you). The WEF’s Rethinking Personal Data project (mentioned earlier) describes the potential for a personal data ecosystem (PDE) of “commercial entities, acting as trusted intermediaries, exchanging assets on behalf of individuals, following a clear set of principles and legally binding contracts”, with the PDV being the technical means to place the individual at the centre of that ecosystem; the PDV provider would be “an intermediary collecting user data and giving third parties access to this data in line with individual users’ specifications” (Hoffman, 2010). A 2010 report by nonprofit Mydex helps to contextualise the PDV, explaining that the PDV is a service to the individual that positions “individuals as information managers” at the “epicenter of a new ecosystem of PIM services” and that it will not just give access to data but “transform relationships between individuals and organisations” (Mydex CIC, 2010); this, to me, is what substantially differentiates the PDE from traditional PIM systems - it is a response to the sociotechnical need outlined in the previous section.
A 2012 report from Ontario’s Information Privacy Commissioner notes that the PDE collides with traditional concepts of ownership when it comes to data, that the PDE needs to “provide a collection of tools and initiatives aimed at facilitating individual control over personal information” wherever it is located; this is another way in which PIM within PDE can be differentiated from traditional PIM (Cavoukian, 2012).
It was against this landscape that Personal Information Management Services (PIMS) became a business area in its own right, the basis for a personal data economy. PIMS providers are attempting to create a market for “tools that help individuals gather, manage and use personal information to make better decisions and manage their lives better”, with a potential market value (in the UK) of £16.5 billion, more than the automotive and pharmaceutical industries (Ctrl-Shift, 2014). In 2016, a global network and non-profit initiative called MyData was founded, bringing together researchers, companies and public sector agencies in the PDE space, in pursuit of a “fair, sustainable and prosperous digital society, where the sharing of personal data is based on trust, and relationships between individuals and organisations are balanced” (MyData.org, 2018). An important aspect of MyData is its aim to combine companies’ needs for data with individuals’ digital human rights. Through analysis of the principles of PIMS, VRM and other related spaces (‘MyData Comparison of Principles document’, 2017), the MyData declaration was produced, outlining a detailed vision for the PDE space to “empower individuals with their personal data, thus helping them and their communities develop knowledge, make informed decisions, and interact more consciously and efficiently with each other as well as with organisations” (MyData, 2017). MyData now has over 700 parties involved worldwide and provides a focal point for the PDE community.
The MyData declaration identifies data controllers’ transparency with data and data-handling practices as an essential means for individuals to gain agency and accountability, and puts forward the idea that the individual should be the point of integration of their own personal data ecosystem - in other words, “everything goes through me”. This is the embodiment of the human-centric ideal of individual empowerment, but is also a good way for data controllers to ensure awareness, accuracy and consent. The declaration also introduces the idea of a personal data operator (also known as a data trust), a key part of the personal data ecosystem - a trusted third party which stores or transfers data on behalf of the data subject, but does not use it themselves. An example operator is digi.me, which has developed a PDV with a “private sharing” model that allows users to permit subsets of their data to be used by external organisations or apps within strictly controlled parameters (Firth, 2019). The MyData/PDE space is very active currently, with many emerging businesses and startups having appeared in the last two to three years. Citizen.me (‘Our Values’, no date) is another company with a similar positioning. Other operators such as UBDI (‘Whose data is it anyway?’, 2019) and datacy (‘About Us’, no date) are positioned under a different business model which aims to help individuals take control of their personal data for profit. Open Humans has a PDV optimised to allow people to share their data for the benefit of research (Price Ball, no date). Ethi is a PDV platform focused on providing individuals with deep insights from their data, and tools to more easily delete their personal data from data-holding organisations (Jelly, 2021).
In this section, I have shown how the emergent human-centric personal data ecosystem has developed from its roots in HCI, ubicomp and HDI. The call for designs and sociotechnical systems that empower individuals with their personal data arises from the power imbalance (Hoffman, 2014a) that has emerged as a result of the datafication of modern life. In the third wave of HCI (Bødker, 2015), user interface design’s main consideration was “what does the user want to do?”. Over the last decade, catalysed by the explosion of Internet culture and the shift from self-install software products to massive-scale cloud-based Internet services, there has been a gradual but perceptible shift away from the tenet that the user’s needs should come first: the designs of commercial and civic web applications now more closely reflect the question (considered from the provider’s perspective) “What do we want the user to do?”. Users (people) and their individual needs have been left behind. The MyData community have clearly outlined the goals to address this problem, but much of the focus at present is on technology questions of how to build better PDVs and better PIM interfaces, or on identifying an effective business model that will facilitate the transition to a PDE - a necessary but distracting question. My research is situated at the bleeding edge of this emerging human-centric personal data ecosystem and, being non-commercial, is able to take a more purist human-centric stance. After uncovering the human experience of personal data (as detailed in 2.1.5) and the lived experience of personal data usage within people’s wider digital life and relationships (2.2.5), I will seek to address a third research gap - to understand the technical, legal, policy, economic and social realities of the PDE landscape itself, sufficient to inform the design of PDE processes and systems.
Thinking of the barriers cascade in the SI space (Li, Dey and Forlizzi, 2010), what barriers exist that inhibit the building or adoption of human-centric PDE technologies? What opportunities might make it easier to overcome these barriers and to catalyse progress toward the human-centric agenda as envisioned in the MyData declaration? What are the key challenges faced when we attempt to build human-centric technologies in today’s world? By applying lessons learned about human experiences of, and attitudes to, the data-centric world to the practice of PDE design and development, can we more clearly map the road ahead and define a research agenda for the next step of tackling the PDE challenge?
By adopting both a participatory design and technical strategist’s standpoint throughout this thesis, building on the theoretical foundations of effective data access, information management and human-centric data interaction, I aim to progress PDE / MyData thinking, using methods detailed in the next chapter, in pursuit of my primary research question, which is:
“What role should people’s data play in their lives, what capabilities do they need, and how could these ideals be achieved?”
In the previous chapter, I described three research areas this thesis seeks to explore: how people think about data and what they want from it, how data fits into people’s relationships with organisations and how they want it to be used, and how could people’s desires for the role data plays in their lives be brought closer to reality. In this chapter I will explain my approach to conducting research in this area, detail the types of methods used, and explain how the different research activities I carried out contribute to those three research aims.
To develop a research paradigm it is important to begin by reflecting upon your outlook on the nature of reality (ontology) and your beliefs on how knowledge of that reality is formed (epistemology) (Guba, 1990). It will already be evident from the literature review and the framing of this thesis so far that individual human perspectives are at the centre of my research questions. This is a reflection of my ontological stance, which is that everyone experiences their own reality, informed by their own concepts and mental models of the world. This is known as constructivism (Guba, 1990), where new knowledge is formed by developing one’s own mental models in order to explain new experiences, as distinct from the positivist view that there is a single universal reality that needs to be uncovered. However, in parallel to this individual learning through experience, people’s realities are constantly shifting and changing, especially in the rapidly changing technological landscape we live in today - consider that our reality now includes concepts that did not exist in our youth, from “feeds” and “posts” to “link sharing”, “syncing” and “blocking”. As new technologies and practices emerge, we develop new mental models to help us make sense of and find value in new capabilities. This idea of reality as something constantly renegotiated by the individual is known as pragmatism (Campbell, 2011). To me this is an overriding truth about reality, and this focus on understanding change, as perceived by individuals, is a key research motivation. Where constructivists may focus more upon deeply understanding an individual’s reality at a moment in time, I am more interested in understanding the ways in which people’s understanding of the world, and of themselves, changes as a result of their lived experience.
At this point we must consider the individual’s motivation for constructing and pragmatically changing their concepts of the world. To understand this we can look to objectivism (Peikoff, 1993), the philosophy put forward by Ayn Rand: a belief that the mind, informed by the senses, is the means by which we discover truths about the world, and that it does so by forming concepts and using inductive reasoning (Smith, 2011) (in essence, “if these things are true, then what else must be true?”) to acquire knowledge. People’s conceptions of reality are thus constantly tested and re-evaluated by their experiences of the world. Objectivism also states that an individual’s motivation in life is the pursuit of their own happiness and wellbeing, and that this self-interest is what drives their pursuit of deeper knowledge and understanding about the world; in essence, everyone wants to improve their own life, and they need knowledge to do it. For me, this view of understanding the nature of reality, so that one might be able to change it for the better, is also a key driver behind my research. As a final philosophical element to incorporate, I also look to Deweyan pragmatism, which states that our knowledge and thinking are tested by actions, not just reason, and that this is how we learn – and that communication and interaction with others is a key part of that learning. Dewey recognises that no individual is solitary; each exists within a society, and “is a social being, a citizen, growing and thinking in a vast complex of interactions and relationships.” (Dewey and Archambault, 1964) People create systems and meanings through those interpersonal interactions, which they can then use to understand everyday life; this is particularly important in the social world because, unlike in the physical, natural world, many concepts are abstract and subject to individual interpretation.
My established ontological stance, then, is that individuals construct concepts, and continually update them through sensory experience, action, social interaction and inductive reasoning in order to maintain a pragmatic knowledge that they can practically apply in society and in the world in order to pursue their own happiness and self-interest.
Based upon this, we can now look to epistemology: how can knowledge be acquired? Having a constructivist rather than a positivist stance means that this is best done not through direct observation of the world and empirical testing of hypotheses, but through interacting and communicating with individuals so that we can interpret how they view reality; this is known as an interpretivist epistemology. Most of the techniques used will therefore be qualitative (understanding perspectives and collecting non-numerical data) rather than quantitative (measuring behaviours and collecting numerical data). The focus of my research is to acquire understanding of people’s views and mental models around data and digital living, so that I can further these concepts in order to develop theories – powerful explanations that can be understood and benefitted from by ordinary people – to fill the knowledge gaps in existing research that I have identified. Given my strong focus on pragmatism and on interpreting people’s constructed social realities in terms of practical usefulness to them, I will not be deeply analysing their words through language analysis techniques such as discourse analysis, but will instead focus on the social, interpersonal level: understanding how people navigate the world of data and data-based relationships and change their understandings as they seek to achieve their goals in practice, and how they are affected by the systems, relationships and society they exist within. It is this practical focus, recognising that within a society there are objective truths that will affect all individuals, that means the methods used will not be solely qualitative, but rather a mixed-methods approach in which I will adopt the most appropriate methods – usually qualitative, sometimes quantitative – for the particular research context and question being explored.
As we move from the general research approach to the specifics of this study, it is important to be clear about what it seeks to achieve. The purpose of the research is to formulate theories that can facilitate change: to map out a research and development agenda that might help the world to move from a data-centric (see section 2.1) to a human-centric (see section 2.3) operating paradigm. Learning about people’s understandings of their reality will inform my own thinking, and by using an inductive research approach we can identify patterns common to multiple people and form theories that might explain these patterns. As a student of digital civics (Vlachokyriakos et al., 2016) I believe that research can surface the ways in which current service provisions fail to meet people’s needs, and can show how the world might better empower citizens if it were configured differently, with services closer to what they desire. The role of the researcher is to understand the world and to figure out how to change it. It is an accepted view that research cannot be value-free, but in fact we can go further: the researcher can be an activist, seeking to correct an imbalance in the world through their research. As such, the design elements of this research can be considered political; this is adversarial design (DiSalvo, 2012), and I view it as necessary to counterbalance the strong forces outlined in Chapter 2 that are acting against individual interests. By creating space to reveal and confront power relations and influence, we can identify new trajectories for action (DiSalvo, 2010). Therefore the purpose of the research is to inform myself as adversarial designer, with the acquired insights from the experiences of research participants helping me to develop my own understanding, models and designs.
When designing for people and trying to incorporate their views, there are traditionally two schools of thought: user-centred design (UCD) and participatory co-design (PD). In UCD, design is carried out by experts who have undertaken user research to build up understandings of user needs (Norman and Draper, 1986). This approach places a high value on expertise, but it carries the risk that certain user needs may be overlooked, especially those that are less common (and therefore less likely to be present in a designer’s concept of ‘the average user’). UCD is the most common approach used by technology companies today, not least because commercial motives must be incorporated into designs, and therefore design can never be fully democratised. UCD as implemented in modern software development practice does, however, recognise the importance of representing the user perspective in the design process, and uses processes such as focus groups, user experience testing and user persona development to include their perspectives. However, such perspectives may ultimately be ignored or diluted in favour of expert designs or organisational motives.
Recognition of this inherent problem – that users carry less influence than designers and that this imbalance must be tackled head-on – led to the ideas of co-creation and PD. PD is based upon the idea that those who will use or be affected by technology have a legitimate reason to be involved in its design (Kensing and Blomberg, 1998), and is seen as an attempt to design in a more democratic fashion. PD proponents argue that it is not sufficient to study users and then go away and design in isolation; instead, users and technologists work together in design workshops, with users bringing their lived experiences and perspectives and technologists bringing their expertise on technical and market possibilities and constraints (Bjerknes et al., 1987; Björgvinsson, Ehn and Hillgren, 2010; Smith, Bossen and Kanstrup, 2017), so that a collective, democratic design is created, taking into account all perspectives. In the 2000s, PD grew in popularity across public and private sector organisations, coincident with the growth of the internet and social media into its “Web 2.0” phase (Hosch, 2017), which began to reframe digital technology as something to be harnessed for users’ own ends (Jenkins, 2006).
As design approaches, I see merit in both UCD and PD. The participant should play a role as an informant – one who can provide critical insights into their own perspective on a design space and help us understand how the world is to them – but also as a designer – one who can imagine how they would like the world to be. As we involve the participant, our role as researchers is to elicit the richest possible responses, using questions to lead participants to consider new questions and giving them stimulating materials to trigger their thinking. The researcher also often needs to sensitise the participant to a design space so that they may properly engage with the questions being posed, but equally the researcher cannot arrive at a model or theory unless they have developed empathy for the participant’s perspective. One of pragmatism’s founding philosophers, Peirce, put forward the pragmatic maxim, which states that the meaning of anything we experience in the world is understood through the conception of its practical effect, and that theories that are more successful at controlling and predicting our world can be considered closer to the truth (Campbell, 2011). Applying this philosophy to the challenge of design, I find merit in the different, less political take on involving users as participants in design exhibited in McCarthy and Wright’s experience-centred design (McCarthy and Wright, 2004) framework, which emphasises the importance of understanding the user’s experience to inform technology design. It identifies six sensemaking processes users go through, which can be considered to help acquire user empathy:
Through my research I will at times be more participatory, to understand these aspects of user experience or to co-design solutions with participants, but at other times I will act more like an expert designer. Taken to the extreme, the PD view is that designs made without the direct involvement of users are invalid, because they inherently no longer represent the desires of those people the designs claim to serve. I oppose this view, because I believe that new ideas will not always arise from participants themselves. This is especially true in this research area, where a more expert-led, experience-centred design approach is the most pragmatic way to proceed: by its nature, this research involves thinking about data, information, organisational relations and interaction (topics that are not often theorised about as part of everyday life) at a level which the layman is not accustomed or well equipped to do. Therefore, while I strive always to include participant viewpoints, I give ultimate precedence in design to my own position of learning, which I will acquire through the research I undertake with participants and develop through theoretical and design work undertaken by myself. In doing so, I will also be a participant in my own research, incorporating my own experiences of living in a data-centric world (and my attempts to challenge it) into my learnings.
It is important to be clear about what constitutes good research in this context: if the outcome of the research is to be my own interpretations and theories, how will we know these are sound? Firstly, it is important to say that this is not about measuring the effectiveness of proposed changes upon the world. There will be no deployment of systems to test the ideas I put forward. This is not because such an activity would not be worthwhile – it would – but simply because, by its nature, to develop, build and deploy new data interaction paradigms that function in real life with real personal data at the sociotechnical level would be too large an endeavour for a single researcher (or even a single research group) to undertake. Therefore what I seek in this thesis is not to change the world, but to articulate with the greatest possible clarity discrete theories on how the world should, and could, be changed. Good evidence for the proposed changes will be achieved by ensuring that findings, themes and discussion contributions are backed up by participant quotes; where an idea is suggested or agreed upon by many participants, or where it resonates with my own embedded experience, that can be seen as adding weight or validation to that idea. However, each person’s experience is unique and needs to be put into context; not every insight will be shared by many participants, and individual unique insights remain important.
The mixed-methods approach I will be adopting closely follows the discipline of participatory action research (PAR), an approach that encompasses the involvement of participants’ perspectives while also retaining a role for the reflection and learning of the researcher themselves. Kurt Lewin, the originator of action research, observed that “there is nothing so practical as a good theory” (Lewin, 1951), which shows the pragmatic nature of this approach. PAR combines self-experimentation, fact-finding, reasoning and learning, and makes sense of the world through collaborative efforts to transform it rather than just observing and studying it (Chevalier and Buckles, 2008). Central to this is the idea that research and action must be done with, not on or for, people; participants are not subjects but co-researchers, evolving and addressing questions together (Reason and Bradbury, 2001). To embody the three ingredients of PAR (Chevalier and Buckles, 2019) – participation, action, and research – my research will include three types of activity:
Action research also carries with it the idea that research is done in cycles: you learn something, carry out some action in the world based on your learning, learn from what happened, and repeat. This has become an established approach in HCI research (Hayes, 2011), and the importance of collecting stakeholder feedback at regular intervals is also seen in the software industry through agile development (Fowler and Highsmith, 2001), which can be seen as a practical implementation of action research. In startups, terms like ‘fail fast’ (Brown, 2015) and ‘pivot’ (Ries, 2011) illustrate the idea that it is crucial to test ideas on real people and then adapt quickly based on how that goes. To me, action research does not mean that you must test every single idea with an audience for it to be considered valid, but rather that user engagement is not a one-off, but a repeated component that affects the research path. Each new research activity will draw from past learnings, theories and the understanding acquired so far, which will be further developed through exposure to ‘real life’ in the process of participatory and embedded research activities.
Figure 3 shows the cycle of action research as I will apply it in this study. In each area of life or context that I identify as a setting for a research activity, I will first carry out initial background reading, experimentation or exploration to familiarise myself with the area, then I will design a research activity that helps to explore my research question in that area. After carrying out the planned activity (be it participatory, self-experimentation or embedded research) I will analyse any data from that activity (or simply reflect upon my experience), and then use these findings to update my overall understanding of the answer to my research questions. I will then go on to repeat this cycle with the next study, but beginning with more developed theories and understandings than before. In the case of embedded research activities, which are likely to run for several months alongside other activities, analysis and learning will happen throughout, resulting in a continually updating current understanding that will form the baseline for later research activities. In the next section I will describe the three specific research objectives that will be targeted through the research activities.
At the end of chapter 2, I introduced my research question, which is:
“What role should people’s data play in their lives, what capabilities do they need, and how could these ideals be achieved?”
Corresponding to the three research gaps I am focusing on as identified in 2.1.5, 2.2.5 and 2.3.5 respectively, there are three distinct subquestions I will explore using the approach detailed above. Each of my research activities will be designed to advance my understanding and theories towards at least one, sometimes more than one, of these three research objectives:
As established in section 2.1, personal data, and its collection and use by commercial and civic organisations, is an established and inevitable part of modern life, yet the concept of data is abstract and poorly understood. The first strand of research I will be advancing through this thesis is to establish a solid understanding of the mental models people have constructed about data. We need to understand what makes data meaningful to people and, given HDI’s belief that everyone needs a relationship with their data, what relationship people currently have with their data. What is data to people? If we are to design new human data relations, we must begin by understanding people’s current relationship to their data, the ways in which that relationship affects them, and their unmet desires for improving it. We need to find out what aspects of data cause positive emotions, what problems people experience with their data, and what people want from their data.
In order to meet this objective, we must take a participatory approach: gathering individual perspectives on data, and looking for patterns or trends in those perspectives, will be the primary means of advancing this research objective. The first challenge will be to find ways to sensitise participants so that an informed and productive conversation can be had about the topic of data, which to the layman may seem a dry, boring topic. This challenge will be addressed by leading participants into the subject using meaningful representations of data as stimulus for conversation, or by starting with the individual’s own life experience to discover the data in their life – which they are more likely to have opinions and emotions about – rather than talking about the subject in the abstract.
In sections 2.2 and 2.3, I established that, as yet, designers of PIM and personal data interfaces have not risen to the socio-technical challenge of addressing the reality of personal data today: that it is scattered, inaccessible and largely unusable. There is no way for people to view their data holistically, nor are there tools to help people manage the many relationships that individuals have with companies, employers, councils, governments and other organisations that rely heavily upon the collection and processing of their personal data. Almost every civic or commercial service we use today handles our data. We know that the world is data-centric, and that data controllers use data as an asset to inform their decision-making, creating a serious imbalance of power (Hoffman, 2010, 2011, 2013, 2014a, 2014b). But what is it like to conduct a relationship with an organisation that holds your data? What emotions do people experience? How does it affect their daily life, and what sorts of problems do people face as a result of this data-centricity? If your data is used in ways you do not understand or consent to, how does this affect your outlook on the world? This is the second strand of research I will be exploring: to gain an understanding of the data world beyond the individual, so that we can design not just better individual relationships to one’s data, but also improve people’s relationships with the organisations that hold and use data. (Note: for the purposes of this study, we only pay attention to service relationships, not social or interpersonal relationships.) In this thesis and its title I use the term “human data relations” to encompass both of these aspects: human-data relations (the individual’s relationship to their data, as imagined by HDI), but also human data relations, i.e. human relationships that involve data.
To tackle RQ2, participatory research approaches are again appropriate, as our questions relate to the individual mental constructs that people have about their wider digital lives and relationships. But there is another aspect here: a relationship involves two parties. Consistent with Dewey’s belief in the importance of interaction in creating meaning, the structuralist philosopher Michel Foucault said that “meaning comes from discourse” (Adams, 2017); in other words, people do not construct their reality in isolation – it is shaped by the social constructs and systems they operate within. Deweyan pragmatism also takes the view that research must seek solutions to real-world problems that are generalisable to use in society at large (Dewey and Archambault, 1964; Friedman, 2006). This implies that any such solutions arising from my research must work for all parties. For both these reasons, I will conduct participatory research to understand both perspectives – that of the data controller and that of the data subject – and where possible I will engage both parties together in discourse, so that their worldviews can be brought together to design solutions that could work in practice for all involved.
This second research objective will be tackled in tandem with the first, so that in each research setting we can examine the situation at two levels - to look introspectively at the individual’s own relationship in service of RQ1, but also to take a step back and look at the wider social context the individual is operating within so that we might be better placed to answer RQ2.
As a software industry professional, and as a pragmatic digital civics researcher, I believe it is important that the outcome of my research is not purely theoretical. While the goal of this PhD is not to build a new data interaction system, it is important to pay attention to how the problems outlined in Chapter 2, and the individual desires and needs we uncover in RQ1 and RQ2, might be addressed in practice. This involves understanding the technical, economic, political and legal landscape that personal data interaction occurs within: gaining clarity on the motivations that service organisations have for being data-centric, and understanding the systems and organisational practices that influence current system and process designs. Just as Li showed that users of personal informatics systems experience a cascade of barriers as they try to achieve more human-centric data goals (Li, Dey and Forlizzi, 2010), it follows that there are also likely to be a series of obstacles that service organisations would have to overcome if they were to approach these goals. We need to uncover these obstacles so that we can design approaches to overcome them. The third strand of my research is to outline practical steps and guidance, both for researchers and for personal data interaction system developers, to make it clearer how they can pursue the goals we identify for improved human data relations.
This strand will be addressed in parallel with RQ1 and RQ2, so that practical discoveries may inform those research questions too. It also means that as new needs and desires emerge from RQ1 and RQ2, they can become “requirements” for the more technical design work of RQ3. As an approach, this will be action research in its purest sense: I will embed myself in projects working in the personal data space, as a developer and a researcher, so that I can gain deep field experience of the constraints and opportunities that affect the design of data interaction systems and processes. Unlike RQ1 and RQ2, this strand will be explored not through strictly configured research engagements but through a process of acculturation to the world of building data systems, developing my own knowledge through design, technical prototyping and pushing the boundaries of the systems that do exist so that they may be better understood. Ultimately these insights should allow me to achieve greater expertise, backed by the empirical findings from RQ1 and RQ2, and to draw conclusions about how I believe the discipline of human-centred data relations should proceed in its future research and development.
As explained in the last section, the three sub-research questions RQ1, RQ2 and RQ3 have been addressed in parallel throughout this research. They can be considered as three parallel trajectories of research and learning, each informed by some or all of my research activities as they progress, in cycles of action research as described in section 3.2 above. Figure 4 shows these three parallel research objectives as downward arrows. Considered as three areas of understanding, RQ1 can be seen as understanding personal data, RQ2 as understanding data in relationships, and RQ3 as understanding how to reconfigure data interaction in practice. Figure 4 also illustrates how the three contexts of study and three major case studies, which I will explain below, contribute to advancing my understanding of each area - with the positioning of the box over an arrow indicating that it contributes to that area of understanding.
The first research context I explored in this PhD was “Early Help”. This is explained in detail in Chapter 4, but in brief: Early Help is a particular type of social support offered by UK local authorities as voluntary help to families who are considered to be at risk of falling into poverty, crime, truancy, addiction or other issues which are both problematic for the individuals and costly to the state. Families enrolled in the scheme meet regularly with a social worker (called a ‘support worker’ in this context), who can provide advice and connect the family with health, lifestyle and social services appropriate to their needs. As part of this, the support worker has access to a variety of data from civic sources – school records, employment and benefits data, social housing data, criminal records, and more – so that they might be better informed about the family’s situation. However, the families do not have any access to this data, and thus, despite this being a scheme that is on the face of it intended to empower families to help themselves, it runs the risk of disempowering the families through the same data-centric power imbalance described in section 2.1.2. This setting therefore provides a very interesting context in which to examine both RQ1 (finding out how these supported families feel about their data) and RQ2 (examining the impacts of data use within a service relationship), as well as to explore how the families and support workers could imagine their data relations being improved.
Within this context I carried out three research activities between 2017 and 2019:
From March 2017 to March 2019, I joined Connected Health Cities’ “SILVER” project (Connected Health Cities, 2017) as a part-time research engineer alongside my PhD. This research project was funded by the UK’s Department for Health (now the Department of Health and Social Care) and brought together local authorities, health authorities, university researchers and technology partners in the North East of England, in exactly the Early Help context described above. Its goal was to explore how to unify civic data about a supported family, with their consent, to allow support workers to provide better care to those families. This made it an ideal place to explore my research objectives: because it was aiming to build a real-world technical solution, it would provide practical insights that would serve RQ3, and as it was also using direct research with families and support workers to inform the system requirements, it would also provide an opportunity for deeper understanding of the use of data within the Early Help support relationship (RQ2) and of both parties’ attitudes to this highly personal and real civic data (RQ1). My role was two-fold: as a software engineer, to design and develop user interfaces that would be used to view this unified data, and as a participatory researcher, to assist with the design and execution of focus groups and workshops with staff and supported families that could inform the proof-of-concept data system being built. This embedded placement is not considered a major case study of this thesis; however, it has contributed to the research objectives and the developing understandings of this context, so it will be referenced in the subsequent chapters, especially Chapter 4 and Chapter 7. Chapter 7 includes a short section [ADD REF TO CHAPTER 7 SUBSECTION] detailing my high-level observations from participating in the project. The final report from the project is available at [ADD REF HERE WHEN AVAILABLE].
In the summer of 2017, in the MRes year of this doctoral training programme, I carried out an initial participatory field study in order to deepen my understanding of data use and attitudes within this context (RQ1) and to develop appropriate research methods. This study consisted of home visits to four different families in the North East who had interacted in the past with social care and support services. During the course of these two-hour visits I carried out participatory co-design activities and interviewed the families (both adults and children) about their civic data, and in particular their views on how risky different types of data were and how that data should be handled. While this fieldwork took place prior to the start of this PhD, the data analysis and publication of the findings took place within the scope of this PhD. Again, this is not considered a primary study for this PhD, but it will be referenced within this thesis. The study was published as (Bowyer et al., 2018), which is included in [ADD APPENDIX REFERENCE TO CHI2018 PAPER HERE].
In the summer of 2018, informed by the SILVER project and the Understanding Family Civic Data study, I designed and conducted my first major case study of this thesis: a series of three participatory co-design workshops with people directly involved in Early Help relationships in North East England. The workshops were funded by CHC and conducted by me, with a dual purpose: to inform the design of the SILVER system but also to serve RQ1 and RQ2 of this thesis. These workshops built upon the Understanding Family Civic Data study in order to validate the earlier findings, but aimed to develop a deeper understanding of what supported families (workshop 1) and support workers (workshop 2) perceive as problems with data use in the Early Help context, and to explore perceived solutions to these problems. The third workshop was specifically designed to focus on the use of data within the support relationship, and was a joint workshop involving staff and parents working together. This case study is described in detail as Chapter 4, and contributes to the general findings about RQ1 and RQ2 presented in Chapter 6.
From the start, a core motivation for my interest in this research has been to look at the power imbalance around personal data from the “everyday life” perspective: to explore our relationship with and through the data that we hold, use or live with as we go about our lives, online and in person. This power imbalance touches everyone, and therefore for my second research context I chose not to focus on a particular community or group but to look at these problems at the level of our day-to-day digital lives. I designed research activities in which I would talk to people about their everyday experiences of data in their lives (RQ1) and their views on the usage of data within their relationships with commercial or civic service providers (RQ2). In 2018, during this PhD, the European Union’s GDPR came into force, enabling people to obtain copies of their own data. This allowed me to take the research deeper than a simple conversation: I could guide my participants through the GDPR process to obtain their data from providers, and then use this retrieved data as a stimulus for discussion, which I hoped would result in a far more grounded and less theoretical perspective. In parallel, I began to conduct my own experiments using GDPR to see and explore my own data. This allowed me to sensitise myself to the research space and to enhance my understanding of RQ3 (finding out more about what is and is not possible in practice when it comes to everyday personal data access), but crucially it also enabled me to become a participant in my own research, enabling a deeper understanding of this research context.
Within this context, I carried out four research activities between 2016 and 2020:
This early study was carried out in late 2016. Its goal was to deepen my understanding of people’s perceived values around everyday technology use and to validate some of my own perspectives. Using participatory interviewing techniques, I explored attitudes to smartphone use, with particular attention to perceived usefulness and barriers. This was designed to provide background on what motivates people as users of technology, an important consideration when looking at disempowerment. The thematic findings from this study are detailed in a report in [INSERT APPENDIX REFERENCE HERE].
In order to further acclimatise myself to people’s attitudes to data and to provide balance to my own attitudes and opinions, I conducted five two-hour interviews with individuals about their digital lives, looking at how they mentally segment their lives and at the roles and functions of different technologies, and especially of data, across those different parts of their lives. As part of this I also explored the participants’ perceptions of their relationships with service providers, in order to identify the ways in which individuals might feel disempowered by the ways their data was handled, and to identify what they would like to change about their data relationships. The interviews were conducted using the Sketching Dialogue (Hwang, 2021) technique, which uses collaborative sketches as a basis for a semi-structured interview. A light summary of observations and findings is presented in [INSERT APPENDIX REFERENCE HERE].
As preparation for Case Study Two, and in order to increase my own empathy and participation in the research, I have since 2018 made numerous efforts to obtain my own data from companies and organisations in my own life. This has entailed over 70 GDPR requests to a variety of organisations including retailers, device manufacturers, online service providers, local and health authorities, banks and leisure services. Additionally, I have experimented with self-service download dashboards and third-party ‘get my data’ tools. In some cases I have engaged providers in communication to try to get better data or to ask questions about my data. These activities have provided multiple benefits: they have enabled me to develop a detailed understanding of what actual stored personal data looks like (which informs RQ1), they have given me an awareness of the evolving response to GDPR from data-controlling organisations (which informs RQ2), and they have allowed me to test the limits of what is and is not possible with GDPR (which informs RQ3). A summary of observations and findings is presented in [INSERT APPENDIX REFERENCE HERE].
As described above, the major study for this context was to guide participants through the GDPR process of retrieving their own personal data, to enable a conversation that included not only attitudes to personal data and the use of data within service relationships, but also discussion of how those attitudes were changed by the experience as it happened, and how well expectations and hopes were met by the process. Eleven participants were engaged 1-on-1 in a 4 to 5 hour process over a series of months, which involved five stages:
Through these stages the objectives were to understand how people view the data that exists about them as they go about their everyday life and what they would ideally want from it (in service of RQ1), as well as what role data plays in their relationships with companies and other data-holding organisations in their lives, and what they would ideally want from those relationships with respect to data (in service of RQ2).
In the final data exploration interviews, which were conducted online over Zoom due to COVID-19 restrictions, a spreadsheet-based approach was used, where participants were walked through a series of Yes/No questions about different categories of their data, and then asked to expand verbally on their reasoning. This produced both qualitative and quantitative data for later analysis.
This case study is described in detail in Chapter 5, and contributes to the general findings about RQ1 and RQ2 presented in Chapter 6.
The third context for this PhD, which has remained a focus throughout, is a more practical one: to go beyond understanding people’s perspectives and to look, in the light of what we learn about people’s desires for their data and their relationships, at what is currently possible in practice. The goal is to find out what factors shape the design and implementation of real-world data interaction systems and processes, to understand what legal, social, economic, technical or political factors come into play and, importantly, to explore what technologies or techniques might be able to pursue human-centric design goals in a data-centric world. In scope, this context is a broad one, encompassing all forms of personal data interaction; as such it is able to draw on the findings of RQ1 and RQ2 from the first two contexts, viewing those as “needs” or “requirements” that would ideally be met through the design and building of new interfaces.
In total, four separate research activities took place within this practical research context between 2017 and 2021:
The embedded role I took in the SILVER project, described in section 3.4.1.1, also contributes to this context, as part of my role was as a front-end software developer for a personal health data interface intended for use by support workers in the Early Help context. Lessons from that experience also helped to serve RQ3. This aspect of the SILVER project is considered out of scope for this thesis, though reference is made to it in Chapter 7.
As a software developer I have long been aware that one of the biggest challenges in building new data interfaces is gaining programmatic access to the necessary data. As part of the trend towards cloud-based services and data-centric business practices, it has become increasingly difficult to access all of the data held about users by service providers. Application Programming Interfaces (APIs) are a technical means for programmers to access a user’s data so that third-party applications may be built using that data. Unfortunately, as a result of commercial incentives to lock users in and keep data trapped (Abiteboul, André and Kaplan, 2015; Bowyer, 2018), much of users’ data can no longer be accessed via APIs. While GDPR data portability requests do open up a new option for the use of one’s provider-collected data in third-party applications, this is an awkward and time-consuming route for both users and developers. Web augmentation provides a third possible technical avenue for obtaining data from online service providers. It relies on the fact that a user’s data is loaded onto the user’s local machine and displayed within their web browser every time a website is used, and therefore it is possible to extract that data from the browser using a browser extension. Similarly, once loaded into the browser, a provider’s webpage can be modified to display additional data or useful human-centric functionality that the provider does not offer.
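At its simplest, the extraction step amounts to pattern-matching over page content that the provider has already delivered to the browser. The sketch below illustrates the idea against invented markup for a hypothetical order-history page; a real implementation would run as a browser-extension content script and query the live DOM (e.g. via document.querySelectorAll) rather than applying regular expressions to an HTML string.

```javascript
// Illustrative sketch only: extracting order data from a hypothetical
// order-history page. The markup format and field names are invented.
function extractOrders(pageHtml) {
  const orders = [];
  // Invented markup: <li class="order" data-date="2020-03-01">£12.50</li>
  const pattern = /<li class="order" data-date="([^"]+)">£([\d.]+)<\/li>/g;
  let match;
  while ((match = pattern.exec(pageHtml)) !== null) {
    orders.push({ date: match[1], total: parseFloat(match[2]) });
  }
  return orders;
}

const page =
  '<ul>' +
  '<li class="order" data-date="2020-03-01">£12.50</li>' +
  '<li class="order" data-date="2020-03-08">£9.20</li>' +
  '</ul>';
console.log(extractOrders(page).length); // 2
```

Once extracted, the same content-script context can also write new elements back into the page, which is how augmentation can add functionality the provider does not offer.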
In order to better understand what is and is not possible using this technique, I participated from 2018 to 2020 as a part-time web developer in a project which was using the web augmentation technique to improve the information given to users of Just Eat, a takeaway food ordering platform in the UK. While this particular use case does not concern personal data, the technology being used by the project was considered highly relevant, and the goals of the research project were also human-centric and consistent with my own research goals - tackling the power imbalance of service providers in order to better serve individual needs. This research project is not detailed within this thesis, and is not considered a primary study for this PhD, but is referenced within Chapter 7. The paper which published the study is [ADD REF goffe ET AL], which is included in [ADD APPENDIX REFERENCE TO GOFFE ET AL PAPER HERE].
Within the personal data interface design context, I undertook my second embedded research activity within the PhD. For an eight-month period (three months full time and five months part time) beginning in early summer 2020, I was a research intern in the British Broadcasting Corporation’s Research and Development department. The BBC has a public remit to carry out research and development in the broadcast, media and information space, including HDI (BBC R&D, 2017), and has over 200 researchers. I was assigned to a project codenamed Cornmarket, a collaboration between user experience designers, researchers and developers which aimed to explore a new role for the BBC in extending its public service role beyond broadcasting into personal data stewardship. The main task was to develop a prototype personal data locker in which people could store everyday data including TV and music media streaming data, health data, and financial data. This provided an excellent opportunity to put all of the learnings acquired thus far for all three RQs into practice, and to further deepen my understanding of RQ3 - the barriers and opportunities to actually building new human-centric data interfaces in the real world. Throughout the internship I was able to explore the problem space from many different angles: sharing my own research expertise, carrying out competitor analysis and background research, information architecture, data modelling, user experience and user-centred design work, technology prototyping, and supporting participatory research activities. This embedded research provided numerous new insights and an opportunity to iterate on and develop my theories and models with BBC colleagues.
This case study is described in detail in Chapter 7 of this thesis.
In the previous sections I introduced my research approaches, the three research contexts, and the different case studies and research activities I carried out. In this section I will explain which methods were used across those studies and why they were chosen.
The methods used in my research can be loosely grouped into five stages, though not every activity involved all stages:
I will now explain each of these stages, with examples from the different studies, as well as providing information about recruitment, ethics and thesis structuring at the end of this section.
As I described in section 3.2, an important first step before any research activity is to sensitise myself as researcher to the research context, which means becoming familiar with relevant issues, systems and practices and increasing one’s empathy for the participants. In the Understanding Family Civic Data study, this entailed a review of grey literature to identify the different types of civic data that councils stored, and conversations with colleagues and partner organisations within the SILVER project to deepen my understanding of Early Help. This same study served as researcher sensitisation for Case Study One: as it introduced me to families that had had some contact with the care system, I was able to gain empathy for supported families and acquire some initial understanding of likely perspectives before working with supported families directly; and through participation in fieldwork with support workers through the SILVER project I was able to gain empathy for the data needs of staff within the care service. In Case Study Two, my self-experiments with GDPR, as well as researching privacy policies and GDPR rights, provided me with similar sensitisation before engaging participants.
Participants need to be sensitised too; when planning participatory research activities such as interviews or workshops, it is important to begin the session with an activity that will acclimatise participants both to the specific area of discussion and to the mindset of problem solving required for a constructive conversation. This goes beyond ice-breaking to thinking about what the participants bring and lack at the start of the engagement. For example, in the Understanding Family Civic Data study, I felt that data would be a hard topic for families to engage with, so I designed the “Family Facts” activity shown in Figure 5. This required family members to consider simple facts about their lives (some provided, and some created by the family members) and discuss whether or not such a fact would be considered data, and additionally whether such a fact should be in the family’s control or that of the authorities. This served the double purpose of teaching families that data is simply “stored information about you”, while also getting them used to thinking critically about data ownership. The technique is discussed further in (Bowyer et al., 2018).
For Case Study Two, I wanted to get participants (and potential participants) to think about the data involved in their everyday lives, especially that stored by commercial service providers. So I put up a series of posters in the common room of my research lab which showed logos of companies that might store data, types of data that might be stored, information about GDPR rights, and possible uses that an individual might have for data they obtain from a GDPR request. Some of these posters are shown in Figure 6. These posters served both as a recruitment tool for the project and were also visited with participants at the start of each interview as a series of talking points to sensitise the participants.
Sometimes sensitisation activities can also serve the additional purpose of bringing disparate participants ‘onto the same page’; this is known in participatory research as co-experience (Battarbee and Koskinen, 2005). An example of this is the “sentence ranking” exercise used at the start of all workshops in Case Study One and shown in Figure 7. Here, a series of sentences were prepared containing opinions about civic data that had been observed from staff and families in earlier research, and participants were asked to rank these according to agreement and importance. This allowed me to validate whether previous findings held with these new participants, but also sensitised the participants to considering and discussing the civic data context and the problems experienced by families and staff. Since the sentences included both staff and family viewpoints, and the activity was carried out in all workshops regardless of whether staff, families or both were present, it served to establish a common set of “requirements” that would be in participants’ minds as they began the subsequent co-design activity within each workshop.
As discussed in 3.2, my research seeks to uncover individual perspectives and worldviews. The primary method I used to do this in both Case Studies One and Two was traditional qualitative interviewing - talking to people about the topic being explored. In Case Study Two, this was largely done on a 1-on-1 basis, mainly because of the sensitivity of dealing with one’s own personal data, and because it allowed me as researcher to get closer to the participant’s individual experience. In Case Study One, group discussions and activities were mainly used. This brought the advantage of being able to ‘prime’ a discussion between participants and then sit back into more of an observational role, which proved particularly insightful when observing intergenerational conversations between family members in the Understanding Family Civic Data study (Bowyer et al., 2018), and in Case Study One it allowed me to observe the negotiation of a ‘middle ground’ between support workers and supported families. In some cases, such as the home visits in the Understanding Family Civic Data study and some visits to council workers as part of my embedding in the SILVER project, I was able to conduct interviews-in-place (Pink et al., 2013) in participants’ own environments, which allowed for additional ethnographic observations to be made as “life happens around” (Mannay and Morgan, 2015) the participants, as discussed in (Bowyer et al., 2018).
I wanted to go beyond ‘just talking’ to achieve a deeper and more detail-oriented conversation, and so in all of my interviews and group engagements I also ensured that suitable stimuli were created to seed and progress the discussion. Given its abstract nature, the topic of data does not always carry a clear meaning in people’s everyday lives, so I needed to find a way to make it more vivid and real. Having sensitised myself to civic data as mentioned in the previous section, I constructed a taxonomy and lexicon for Family Civic Data, and created “Family Civic Data Cards” (shown in Figure 8) for use in activities and discussions. These serve as boundary objects (Star, 1989, 2010; Bowker et al., 2015) - representational artifacts that are understandable by people who come from different perspectives, providing a common vocabulary for discussion (as well as serving to enable co-experience, detailed above). Each card represents a different category of data, including a summary and meaningful examples to make it easy to digest, yet still containing sufficient detail to stimulate thinking. The cards were designed to be bright, child-friendly and appealing to engage with. The tangibility of these artifacts was important too: they became things to think with (Papert, 1980; Brandt and Messeter, 2004) that could be used in discussions and in activities. Researchers have had success with the use of tangible objects to embody discussion concepts in order to stimulate and structure discussion, for example Coughlan’s use of a dolls’ house to explore attitudes to home energy use (Coughlan and Leder Mackley et al., 2013) or more recently Xie’s Data City, which used AR-enhanced cardboard models to represent data-processing functions (Xie, Ho and Wang, 2021). Many of these approaches have their roots in Dourish’s concept of embodied interaction (Dourish, 2001).
These cards were used throughout the Civic Data research in both sensitisation and card sorting (Spencer and Warfel, 2004) tasks, for example asking participants to position the cards on a pinboard according to perceptions of risk and ownership (see Figure 9), or sorting them into trays according to relative personal importance. The cards proved very effective at enabling a personal and detail-oriented discussion: participants voluntarily opened up about sensitive topics (e.g. domestic violence or criminal records) raised by the cards because of their detached-but-relatable nature. The sketching dialogue technique (Hwang, 2021) used in the digital life context can also be seen as another application of this approach; by putting both participant’s and researcher’s focus upon the page, rather than on each other, it can feel less invasive and more collaborative, and makes it easier to focus on details (see Figure 10). Of course the ultimate stimulus for discussion about data is to view the actual data itself. Exploring data together with participants to elicit opinions and insights is a well-established technique (Coughlan and M. Brown et al., 2013; Chung et al., 2016; Puussaar, Clear and Wright, 2017). This is the technique used within Case Study Two, asking participants about the data they retrieved from GDPR requests. The spreadsheet-based approach mentioned above was another example of a stimulus for discussion, and it allowed the Zoom-based interviews to retain a “gathered around the table looking at things together” ambience despite the remoteness necessitated by COVID-19 restrictions.
In 3.2 I also introduced the concept of participatory co-design (PD) as an additional research approach. This becomes particularly important when exploring solutions and ideals rather than understanding what participants perceive as problems. It involves bringing participants into a new mental space where they can imagine the realm of the possible, rather than just their current lived experience. Within Case Study One, PD was an important part of the research with both family and staff groups. In the early stages of a PD activity, it is important that participants are able to generate a wide range of ideas, even fantastical ones, without constraints, self-censoring or judgements. This is known as the ‘discovery’ phase in the UK Design Council’s double diamond framework (Design Council UK, 2004). Golembewski’s ideation decks technique (Golembewski and Selby, 2010) was chosen for this purpose, as it allows participants both to select ‘ingredients’ of a design based on their own experience and to combine them in a variety of different ways to generate novel ideas, guiding them into a previously unconsidered solution space.
After generating a wide range of ideas using the ideation decks, participants were then invited to pick just one or two ideas to develop into posters, each with three ‘features’ highlighted. An example is shown in Figure 12. This activity corresponds to the ‘define’ phase of the double diamond, where participants narrow down the options.
For the final workshop of Case Study One, where both parents and staff were brought together to explore possibilities of shared data interaction within the support relationship, I used a storyboarding activity. Drawing from the world of film production, storyboarding is a well-established technique in participatory design (Spinuzzi, 2005; Moraveji et al., 2007). Usually it involves the participants drawing out a series of sketches in the form of a comic strip ‘telling the story’ of an interaction, encounter or activity. However, as I wanted to focus on the interpersonal relations and process rather than the visual aspects of storytelling or interface design, I used a card-based approach to storyboarding, where participants selected actions from a palette of action cards representing different possible human or data interaction possibilities and annotated these with specific details. These cards are shown in Figure 13 and described in more detail in Chapter 4. The cards were designed with colour-coded borders to distinguish staff member actions (blue), parent actions (yellow) and shared actions (green), and participants showed that they were confident making their own decisions about their own action types while reaching collaborative decisions on the shared actions.
In Case Study Three in particular, and also in the Self GDPR Experiments of Context Two and the development aspects of the embedded SILVER placement in Context One, the focus was not on uncovering individual perspectives but on direct experimentation in the world to discover constraints and possibilities – in line with the philosophy of Deweyan pragmatism referenced in 3.1. To design a better future, we must understand the world as it is, not just as people perceive it. Another justification is that as designers or software developers, we need not only user requirements but knowledge of actual constraints and possibilities for implementation if we are to create something realistic and feasible for use in the real world. With this in mind, I conducted many practical explorations of data interaction throughout this thesis. Loosely these can be divided into design activities, prototyping, and interface development.
In Case Study Three, as part of my placement at BBC R&D, I co-designed a conceptual personal data locker interface for unifying a user’s data from different sources and then partitioning it into different ‘areas of life’. Our design was mocked up visually by BBC colleague Jasmine Cox and is shown in Figure 14. Imagining and iterating on possible interface designs and user flows is an important part of the process of prototyping possibilities - some ideas seem viable until you actually try to detail them.
As mentioned in 3.4.2.3, I had been gathering my own data from GDPR requests since 2018. This ‘testing what is possible’ of GDPR processes provided valuable insights to inform both RQ2 and RQ3, but also provided me with copies of my own personal data. Within Case Study Three, at BBC R&D, I participated in a ‘hack week’, as part of which I explored possibilities for personal data locker interface designs. Using the data I had retrieved via GDPR, I built a prototype user interface in JavaScript, shown in Figure 15, that would import data files from different parts of life and extract information that could then be used to categorise and display my own data. Doing this heightened my understanding of what is possible with real GDPR-retrieved data, and of the complexities of dealing with and analysing it in practice.
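The heart of that hack-week prototype was an import-and-categorise step. The sketch below conveys the idea with invented field names and keyword rules; the real prototype’s logic was specific to the export formats of my own GDPR-retrieved files.

```javascript
// Illustrative sketch: assigning parsed export records to 'areas of life'.
// Both the keyword rules and the record fields are invented for illustration.
const AREA_KEYWORDS = {
  health: ['steps', 'heart', 'sleep'],
  media: ['episode', 'track', 'playlist'],
  finance: ['payment', 'invoice', 'balance'],
};

function categoriseRecord(record) {
  // Crude but effective for a prototype: keyword-match the serialised record.
  const text = JSON.stringify(record).toLowerCase();
  for (const [area, keywords] of Object.entries(AREA_KEYWORDS)) {
    if (keywords.some((kw) => text.includes(kw))) return area;
  }
  return 'uncategorised';
}

// Records as they might appear after parsing provider export files.
const records = [
  { source: 'fitness-app', steps: 8200, date: '2020-06-01' },
  { source: 'streaming', track: 'Some Song', playedAt: '2020-06-01T20:15' },
];
console.log(records.map(categoriseRecord)); // → [ 'health', 'media' ]
```

In practice, much of the complexity lay not in this step but in parsing each provider’s idiosyncratic export format in the first place.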
As a front-end developer embedded within the SILVER project, I was responsible for building a functional user interface for support workers to explore health data, illustrated in Figure 16. This provided an opportunity to put the ideas of timelines and Temporal PIM (see section 2.2.2) into practice and explore which features are most useful; the SILVER project ran an evaluation workshop of this software with support workers at a local council, which provided further insights into which features are most valuable when interacting with personal data.
In order to find common viewpoints and extract insights from the many participatory activities I conducted in Case Studies One and Two, I needed to analyse the qualitative data. The general approach taken was to audio record (and occasionally video record) all interviews and workshops, and to produce a written transcript of the words spoken. Digital photos were taken to capture card arrangements, rankings and other transitory choices, as well as designs, life sketches and other participant creations. While it is possible to analyse participant designs in more detail, I used them solely to add contextual understanding to the conversation transcripts and did not examine them further. Field notes were captured during or soon after each engagement. Then a process of thematic analysis was undertaken. This involved examining the text of the transcripts (with reference to all relevant digital artifacts to add context), and identifying the underlying ideas, themes and opinions of the participants. Thematic coding is a well-established technique in qualitative research (Braun and Clarke, 2006). I selected the Quirkos software for this purpose, as shown in Figure 17, due to its more visual organisation and simpler approach than the more commonly used NVivo. After initial coding of transcripts, a process of reductive data display cycles (Huberman and Miles, 2002) was used to group codes into themes, which became the key findings of the data chapters 4 and 5. In Chapter 7 a similar approach was used, although as this was not a participatory engagement, the source text was my own captured field notes, informed by design materials and other digital files created as part of the research placement.
While the participant data in Case Studies One and Two was largely free-flowing and very loosely structured conversation, the structure of some activities allowed some data to be captured numerically, notably the sentence rankings and data card placements in the Understanding Family Civic Data study and the trust/power ratings and GDPR spreadsheets produced in Case Study Two. These data points were captured into Excel spreadsheets and, where appropriate, analysed using formulae to produce weighted means and standard deviations to help contextualise the findings. An example is shown in Figure 18. Due to the qualitative focus of my research, participant numbers were too low to seek statistically significant findings, so the quantitative findings are not intended to be representative of any population at large.
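The statistics involved were deliberately simple descriptive measures. As an illustration (with invented rating counts), the spreadsheet formulae correspond to calculations like these:

```javascript
// Weighted mean: each value is weighted, e.g. a rating weighted by how many
// participants chose it. Equivalent to SUMPRODUCT(values, weights)/SUM(weights).
function weightedMean(values, weights) {
  const totalWeight = weights.reduce((a, b) => a + b, 0);
  const weightedSum = values.reduce((sum, v, i) => sum + v * weights[i], 0);
  return weightedSum / totalWeight;
}

// Population standard deviation of a list of values (Excel's STDEVP).
function stdDev(values) {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

// Invented example: ratings 1-5 and the number of participants giving each.
const ratings = [1, 2, 3, 4, 5];
const counts = [0, 2, 3, 4, 1];
console.log(weightedMean(ratings, counts)); // 3.4
```

With the small participant numbers involved, such figures served only to summarise and contextualise the qualitative discussion, not to support statistical claims.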
As well as analysing participant data, an important aspect of pursuing answers to the three research questions was to develop theories, models and ideas and then to iteratively develop those models over time. This was particularly important in Case Study Three, which was the place where theoretical knowledge acquired from the first two case studies collided with practical reality. As part of this process, I produced many different models of personal data and of personal data interaction. In some cases I was able to test these by discussing them with expert colleagues at the BBC; in other cases by disseminating ideas through blogs, tweets, workshop papers and lectures, a process which helps to refine and clarify ideas but also stimulate valuable discussions with interested people to gain feedback that helps develop the models further. Figure 19 shows an example of a model I was developing for unifying personal data in the PDV context while embedded at BBC R&D.
| Research Activity | Engagement | Stage or Phase | Duration | Number of Participants | Recruitment Method |
|---|---|---|---|---|---|
| Understanding Family Civic Data study | 4 x Home-based Interview | preliminary | 4 x 2 hours | 7 adults and 6 children from 4 families | Posters and Visits to Local Community Centre |
| Main study (Data Interaction in Early Help) | 1 x Group Design Workshop for Families | 1A | 1 x 2 hours | 8 adults and 9 children from 5 supported families | Selected by Local Authority Care Services |
| Main study (Data Interaction in Early Help) | 2 x Group Design Workshop for Staff | 1B | 2 x 2 hours | 36 support workers & related staff | Selected by Local Authority Care Services |
| Main study (Data Interaction in Early Help) | 1 x Combined Staff and Parents Group Design Workshop | 2 | 1 x 2 hours | 3 support workers and 4 parents from supported families | Selected by Local Authority Care Services |
Tables 1 and 2 summarise the participants involved in this research2. In Case Study One, recruitment was initially attempted using posters placed in local libraries, as shown in Figure 20 below. When this approach was unsuccessful, participants were recruited with the assistance of a local community centre [SHOULD I NAME IT?], which allowed me to visit a community social meeting and talk to residents about my study. This community was located in a low-income area known to include a number of supported families; in this way we were able to access, for this preliminary study, a population very similar to that which we would reach through the local care authorities for the main study, avoiding some bureaucratic obstacles which were delaying recruitment through official channels. For the main engagement of Case Study One, I was able to work with two local authorities, Newcastle City Council and North Tyneside Council, who were partners on the SILVER project and provided suitable participants who were actively involved in their Early Help programmes. In the preliminary study and in the first families workshop of the main study (stage 1A), activities were designed to include children as active participants in the research, as it was felt they would bring valuable contributions to the somewhat abstract creative co-design work and because it would be valuable to observe intra-family conversations. The final combined workshop with staff (stage 2), however, was designed to include only adult participants, because its focus on processes and on the care relationship itself was thought to be too dull and potentially too sensitive for children to participate in.
| Research Activity | Engagement | Stage or Phase | Duration | Number of Participants | Recruitment Method |
|---|---|---|---|---|---|
| Smartphone Usefulness study | 3 x 1-on-1 interview | preliminary | 3 x 45 minutes | 3 adults | Convenience sample |
| Digital Life Mapping study | 5 x 1-on-1 interview | preliminary | 5 x 2 hours | 5 adults | Convenience sample |
| Main study (Guided GDPR) | 11 x 1-on-1 interview (Life Sketching) | 1 | 11 x 1 hour | 11 adults | Convenience sample |
| Main study (Guided GDPR) | 10 x 1-on-1 interview (Privacy Policy Reviewing) | 2 | 10 x 1 hour | 10 adults | Continuation from previous stage3 |
| Main study (Guided GDPR) | 10 x 1-on-1 interview (Viewing GDPR returned data) | 3 | 10 x 2 hours | 10 adults | Continuation from previous stage |
In Case Study Two, the digital life study, it was felt that no special population was needed, as the issues of living in a data-centric world were likely to affect everyone equally. Therefore, a convenience sample (largely 20-40 year old postgraduate students from Newcastle University) was used. Care was taken to find an even split of male and female participants, but other than that no selection criteria were applied. The participants used for this study were thought likely to have a greater awareness of societal issues around personal data use, and greater familiarity with participatory co-design, than the average layperson, but this was considered an advantage as it would reduce the amount of sensitisation required.
In all cases4 for both case studies, participants were compensated for their time with vouchers – either online/offline shopping vouchers or, in the case of the families workshop, vouchers for a family day out of the family’s choice.
All research activities referenced in this thesis were planned in advance, with interview schedules, information sheets, debriefing sheets, participant consent forms and ethics forms being completed and submitted to Newcastle University’s SAgE faculty ethics board, which approved all the studies before they commenced. Ethics paperwork is included in [INSERT APPENDIX REFERENCE TO ETHICS FORMS]. Most of the engagements were routine interviews and therefore did not require any special measures for safety or ethical reasons. It was made clear to all participants that they were free to withdraw from my research at any time without giving a reason. The following special measures were included in plans in order to satisfy ethical considerations:
Visiting private homes: In order to protect myself and other researchers from any physical risks or any accusations of impropriety, all home visits took place with two researchers present, and contact was made with a colleague before and immediately after the interviews to confirm everything was ok.
Working with children: Activities were designed to be child-friendly (not just safe, but engaging). The families workshop took place at a park with a nearby cafe and playgrounds for children, and catering was provided. Within the room, an activity area was provided for smaller children who were not directly participating to play while their parents and older siblings engaged. There was always more than one researcher present and the research team was never alone with children.
Protecting personal data privacy: In Case Study Two, particular care was taken to design ways for researchers to talk to people about their personal data without violating participants’ right to privacy. The research was framed such that the data retrieved from companies remained the participants’ own data and would never be directly collected or handled by the research team; it was made clear that as researchers we were interested only in what was said, not in the data itself. Initially, a privacy monitor was developed which could only be seen through viewing glasses that were in the participant’s control. This would allow a researcher to sit next to a participant who was viewing their personal data, without the researcher being able to see it. Additional measures to protect users’ data included clear instructions on how to keep data safe before, during and after the study. A complaints procedure was also written at the request of the Ethics board.
Adapting to COVID-19: As COVID-19 changed working and living conditions in early 2020, Case Study Two was adapted to no longer rely on face-to-face engagement. The in-person privacy monitor approach was abandoned and replaced with an online Zoom-based approach. In this model participants would share parts of their data using screen sharing instead, and could move windows off screen to protect their privacy. The full study plan for Case Study Two was rewritten for online-based participation and was re-approved by the Ethics Board.
In writing up this thesis, I made a choice to foreground my three most substantial research activities as Case Studies, and not to detail the other activities carried out beyond the high-level summaries included in this chapter. Case Studies One and Two each span two research questions (RQ1 and RQ2 - see Figure 4 in section 3.4) as they explore both people’s relationship with data and the relationships people have that involve data. Case Study Three maps directly to RQ3, and is focused on designing human data relations in practice.
Because of the overlapping RQs in Case Study One and Two, I have structured the subsequent chapters as follows:
In this chapter, I describe the first major case study of this PhD, in which I ran four two-hour participatory co-design workshops involving local authority support workers, and parents and children from supported families that had recently participated in Early Help programmes, a targeted social care provision offered by local authorities to ‘at risk’ families across the UK. The purpose of the research was to build upon prior explorations to gain a deeper understanding of family and staff attitudes to civic data holding (in pursuit of RQ1) and to move beyond this and explore the role of data within the support relationship (in pursuit of RQ2). A particular area I explored was the possibility of shared data interaction, in which supported families and their support workers would interact with data together and in person as part of the support engagement.
In section 4.1, I will provide background on the Early Help context in England. In 4.2, I will review the prior findings from my own preliminary studies as well as those of others, including Connected Health Cities, and show how these findings were used to establish common ground within the sensitisation activities at the start of each workshop. In 4.3, I will describe the three themes discovered through qualitative analysis: that families want to be given a voice (4.3.1), that trust can be earned through data and process transparency (4.3.2), and the concept of meaningful data interaction for families (4.3.3). In section 4.4, I will discuss these findings in the context of prior literature, drawing insights into the value of involving people with their data (4.4.1), the need for human interaction to make data interaction effective (4.4.2), and the pros and cons of the shift in the locus of decision-making towards the family that shared data interaction would bring about (4.4.3). In 4.5, I will summarise the case study in terms of how these insights expand our understanding of the research questions and their wider significance.
In the UK, the social care system has been shaped by a history of efforts, initially under the Every Child Matters policy programme [ADD REF], to improve the lives of children, especially those suffering the most. The Contact Point and Common Assessment Framework (CAF) programmes were established with the aim of creating universal digital tools to support co-ordination at a local level across public sector services, centred around children and young people (REF Wilson et al 2011; Cornford, Baines and Wilson, 2013), later expanding to include their families (Malomo and Sena, 2017). A change of government in 2010 saw many of the policies around children and families move from a basis of universal access to a targeted provision. Programmes such as Think Family [Cornford, Baines and Wilson (2013); REF Crossley] introduced a focus on family intervention as a primary approach; social workers learn about and get directly involved with the lives of targeted young people and their families in order to understand problems and to help empower them to overcome specific difficulties they face. The Troubled Families Programme (TFP), created in 2012 for England, was built upon a claim that £9 billion of civic spending was attributable to just 120,000 families and that a net saving of £11,000 could be achieved for each family that could be ‘turned around’. Local municipalities were required to work with partner agencies to identify troubled families5 – those ‘at risk’ families experiencing multiple issues from a list including unemployment, overcrowded housing, poor education, mental health issues, disability, low income, poverty, truancy, crime and domestic violence – and to work with such families to reduce these risk factors for them (Bate and Bellis, 2018).
The TFP was set up in such a way that local authorities could claim central government funding for each family they had provably ‘turned around’, and as such encouraged extensive collection and use of data about each supported family to track and demonstrate progress and impact. This shift towards using data mirrors the societal rise of data-centrism described in section 2.1, but was also being seen across the public sector; under increasing pressure to demonstrate performance and deliver measurable, consistent results, all human services (including social care, health care and education) have become adept in the collection and use of data about their clients or service users. The use of data by the state as a means to represent and think about families is considered problematic (Cornford, Baines and Wilson, 2013; Barbosa Neves and Casimiro, 2018). For instance, from the perspective of the state, such data may include both objective facts from families’ lives, such as address or family inter-relationships, as well as potentially more subjective information such as practitioners’ observations or numerically-quantified measurements of risk. The risk of inaccurate data or unfair judgement is compounded by the fact that the clients of such services typically have limited access to this data. Although in theory families retain the ability to interact with services (and have access rights to data), the practitioners and the organisations for which they work become de facto gatekeepers to the data about a family (Corra and Willer, 2002). This is then played out in a policy context where data-driven approaches to family care are encouraged through policy and reports about improving the quality of the sector (Field, 2010; OFSTED, 2015; Bate and Bellis, 2018; Department for Education, 2018).
Over the last decade, Early Help programmes have become a key social care offering from almost all local authorities. These programmes seek to pre-emptively and voluntarily help individual residents before statutory intervention is needed. Early Help was quickly identified as a suitable setting to explore the use of family civic data (a term I introduce in (Bowyer et al., 2018)) and its impact on individuals in this data-centric policy context. Connected Health Cities’ SILVER project, a Department of Health and Social Care funded project working across five local authority areas in North East England, aimed to improve Early Help support through improved use of family civic data. Through my embedded collaboration within this project, it was possible to observe the existing use of families’ civic data by Early Help practitioners and front-line support workers.
The need to produce data for use as evidence for schemes like the TFP led local authorities to update their Early Help processes: support workers would now carry out an ‘early help assessment’ (a guided enrolment questionnaire) to create an ‘early help record’ (EHR) for each supported individual and their family, which is then stored in a case management system such as CareFirst, LiquidLogic or eCAF. To help form a holistic perspective of a supported family’s situation, a process of information gathering and family-centric inter-agency collaboration is adopted. The EHR is supplemented by data from other agencies reporting on an ad hoc or periodic basis (e.g. via emailed spreadsheets, phone conversations, and in-person meetings such as the Team Around the Family (TAF), a bespoke grouping with representatives from other agencies such as police, schools or housing agencies). This data is used to evaluate that family’s situation and progress against the ‘Common Assessment Framework’ [ADD REF]. Support workers are encouraged to use data as evidence at all stages.
An Ofsted report into early help in 2015 found that early help services across the UK were too inconsistent and recommended that greater standardisation in assessment and evidence-based practice were needed. Consequently, Early Help schemes continue to seek more data about ‘at risk’ individuals to use as evidence and to inform their care. Support workers, if provided with better data, can in theory make better decisions as part of the care they provide, and this belief that the best evidence is data is baked into national policies: ‘IT systems are most valuable when practitioners use the shared [between agencies] data to make more informed decisions about how to support and safeguard a child.’ (Department for Education, 2018). Such central policies highlight that in the UK, early help work is a data-driven service.
Despite this policy goal, the technical reality has been far more complex. Many different IT systems are used for social care, even within the same local authority; teams work in isolation using different systems and applications. The information ecosystem that the care services fit within is vastly complex (Copeland, 2015), with each part of the system running its own ICT and only limited arrangements in place to facilitate information sharing across the different data-holding authorities (which sometimes include local charities, with their own ICT systems, to which care functions are outsourced). The existence of different administrative boundaries for different authorities and agencies further complicates the situation. This fragmented ecosystem has proliferated because each local authority is responsible for procuring its own IT systems in the absence (despite recommendations (Harbird, 2006)) of any centralised systems or information sharing standards.
The reality of information sharing in this context today is that many barriers exist – for example, care workers can rarely access health data from GPs and have to rely on school nurses, health visitors, specialists or the individual’s own account. Where such information is shared, it is often in the form of emailed spreadsheets or reports, telephone conversations or committee discussions, and is not supported by technical integration. No one team, agency or authority can have a full picture of an individual’s data (Malomo and Sena, 2017). Different operating policies, consent agreements, privacy regulations, technical access levels, system functions and staff competences result in different interpretations and limitations about what data can be shared (Malomo and Sena, 2017). Data should flow freely through the system in the service of individual care, but it does not; the public sector has a closed and fragmented ecosystem (Pollock, 2011).
Processes such as TAF meetings and the attempt to unify all information onto a single EHR can be seen as a recognition of this failure in the system to produce a single source of truth or understanding of individuals from a ‘whole life’ perspective. In attempting to create and expand the EHR as a central representation of truth about the family in order to inform care decision making, we can see data-centric solutionism (Morozov, 2013) being applied to try and solve a problem that was created by a data-centric approach in the first place.
While support workers often refer to data from the EHR, the families they are supporting have no access to the data records and are only aware of those aspects that support workers or TAF professionals choose to share with them; often such data is reported only in verbal form and would rarely be shown in its entirety. Critiques suggest more data may only consolidate more power in practitioners’ hands and further undermine the families they are meant to be supporting [Neff (2013);REF White and Wastell;REF Crossely]. The scattering of data across so many different systems and organisations, combined with informal processes for sharing, provides serious opportunities for privacy breaches or the mishandling of people’s personal data. At the most basic level, this might be a violation of consent – the passing of some data, collected for a specific purpose, to another authority for some new purpose without the data subject’s explicit consent for such use. The creation of the EHR as a source of truth carries significant risk of disempowering families further and countering the empowerment goals of the programme itself: the possibility of errors in the personal data that goes into the EHR is high, and might result in prejudice or unfair decisions being made. In more serious cases, individual privacy may be violated, or individuals put at risk, if a domestic abuser or criminal gained access to the record. The failure of such case record systems to properly represent families (Cornford, Baines and Wilson, 2013) produces further risk; information shared by one individual in confidence could be seen by another family member, with potentially extreme psychological consequences, such as an adopted child finding out they are adopted.
Data is not neutral (Gitelman, 2013; Neff, 2013), and collecting data within the context of the delivery of a specific service or intervention, rather than as an objective collection of facts, undermines local professionals’ discretion and organisational agility to deliver the care that is needed (Cornford, Baines and Wilson, 2013; Lowe and Wilson, 2015). This means that rather than improving the situation of a family, the collection and use of data may instead be reinforcing the existing asymmetries of power between the data-holding organisations, the practitioners and the supported families (Cornford, Baines and Wilson, 2013).
This context therefore provides an ideal opportunity to study the dynamics of data use and its impact upon service relationships, in service of RQ2. Following preliminary sensitisation research with both families and support staff (summarised in 4.3 below), a study was designed with the objective of investigating the role of data within the Early Help support relationship, from the individual perspectives of both parties (in so doing deepening our understanding of RQ1), but also looking at the power balance and effectiveness of the relationship as a whole, remembering that the ultimate goal of Early Help is to empower families to build better lives for themselves and to get them to a point where they no longer need support. A further objective in exploring RQ2 is to examine possible alternative models for the use of data within Early Help relationships, and to explore the viability and potential benefits of such models with participants, in pursuit of better and more effective support relationships and more empowered citizens. The approach taken to this objective is to conduct participatory research separately with supported families and with support workers to understand their separate perspectives, concerns and needs, then to identify common goals and bring both parties together in further participatory work to explore and design solutions that would improve the effectiveness of the relationship for all, in pursuit of those common goals.
As outlined in section 3.5.1, the first step in designing a study like this is to sensitise oneself as researcher to the study context. In this case, there were three things to familiarise myself with: the type of data being stored, the family perspective on the storage and use of that data, and the support workers’ perspective on the same. Importantly, I needed to understand how families and support workers understood and talked about this data, so that I could represent and refer to it in ways that made sense to them. To do this, I collaborated with colleagues in the SILVER project and at local authorities to see anonymised examples of what data was used by TAF/Early Help teams or mentioned by support workers as being of interest. I adopted the term Family Civic Data to refer to these types of data (further detailed in (Bowyer et al., 2018)) and organised them into groupings and categories to create the taxonomic model shown in Table 3:
| Category | Type of data | Examples/Details |
|---|---|---|
| Family | Personal details | Date of birth, address, telephone number. |
| | Relationships | Marital status, ex’s, step-parents, living arrangements. |
| | Children | Parentage, adoption, fostering, childcare. |
| Education | School Records | Attendance (truancy), special needs. |
| | Academic Results | SATs, reports, exam failures, training courses. |
| Welfare | Social Support | Social worker visits & notes, details of family crises, interventions, allegations. |
| | Welfare Benefits | Jobseeker’s Allowance, child support, Disability Living Allowance, tax credits. |
| Money/Work | Family Finances | Salary, savings, credit cards, spending, debt. |
| | Employment | Job history, periods of unemployment, performance at work, NI, PAYE, pensions. |
| Civil | Housing data | Council house provision, eligibility criteria. |
| | Legal documents | Birth/marriage/death certificates, citizenship/immigration status, work permits. |
| Crime | Criminal records | Arrests, cautions, offenders’ registers, prison time, speeding tickets, spent convictions. |
| | Court orders | Restraining orders, lawsuits, custody, ASBOs. |
| | Domestic Violence | Allegations made, medical records, social/legal interventions, victim support. |
| Medical | GP records | GP’s notes, prescriptions, tests, referrals. |
| | Hospital records | Operations, hospital stays, emergency care. |
| | Medical conditions | Diagnoses, diseases, allergies, blood type. |
| | Mental health | PTSD, breakdowns, depression, sectioning. |
| | Addictions | Substance abuse, gambling, rehab, crime. |
| Leisure6 | Library Usage | Books/CDs borrowed, computer access. |
| | Sports & Health | Gym usage, class attendance. |
| | Shopping Habits | Loyalty cards, store & online purchases. |
| | Transport Data | Buses used, ANPR tracking, walking patterns. |
Early research recruitment attempts revealed that data is seen as an abstract concept in people’s daily lives; a dry, technical topic that many families feel unqualified to talk about. We needed to make these data concepts relatable. Drawing on the work of Brandt and Messeter (Brandt and Messeter, 2004) in creating design games, which observes that game pieces can be used to create common ground and act as “things-to-think-with” (Papert, 1980; Brandt and Messeter, 2004), I created a set of data cards (shown in Figure 8 in the previous chapter) that would serve as a visual and tangible representation of Family Civic Data. By using these as boundary objects (Star, 2010; Bowker et al., 2015), the aim was to bring researcher and participants’ worlds closer together and to approach the concepts of data by starting directly with individual life experiences. A Data Card was created for each category in Table 3, including a summary and meaningful examples, so that the cards would be easy to digest yet still contain sufficient detail to stimulate thinking. Keeping child-friendliness in mind, bright colours were a key element of the design. The cards were printed on high-quality, thick card with a glossy finish using a business card printing service to make them appealing and fun.
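To make the taxonomy-to-card mapping concrete, it can be sketched as a simple data structure. This is purely an illustrative sketch: the names and truncated taxonomy below are hypothetical, and it does not represent how the physical cards were actually produced.

```python
# Illustrative sketch only: the Table 3 taxonomy as a mapping from which
# the face text of one data card per category/type could be composed.
FAMILY_CIVIC_DATA = {
    "Family": {
        "Personal details": ["Date of birth", "Address", "Telephone number"],
        "Relationships": ["Marital status", "Living arrangements"],
    },
    "Education": {
        "School Records": ["Attendance (truancy)", "Special needs"],
    },
    # ... remaining categories from Table 3 elided for brevity
}

def card_text(category: str, data_type: str) -> str:
    """Compose the face text for one hypothetical data card."""
    examples = ", ".join(FAMILY_CIVIC_DATA[category][data_type])
    return f"{category}: {data_type}\nExamples: {examples}"

print(card_text("Family", "Personal details"))
```

The point of the structure is simply that each card pairs a recognisable life category with a handful of vivid examples, which is what made the cards digestible yet thought-provoking.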
These cards were then used as research stimuli (see 3.5.2) within a preliminary study in which I met with four families in their homes7 and conducted a variety of participatory design activities and design games in order to explore family attitudes to family civic data. This study has been published at CHI (Bowyer et al., 2018) where its full findings are detailed, and these findings serve as researcher sensitisation to inform the main Case Study One. We found that once families had understood data as “stored information about their lives” they were able to very effectively engage and talk about it. The use of the games and the cards was very successful, keeping a light and playful environment and making the topic relatable. The topics on the cards served as a focal point that allowed families to talk freely about their own lives and views without feeling personally interrogated, as they were dissociated from the participants’ lives.
The families we spoke to did care very much about what happened to their civic data, contrary to the expectations of some of our peers, and perceived a variety of risks due to data mishandling including identity fraud, criminal targeting and psychological harm. Families felt that data could easily misrepresent them through errors, prolonged storage of data beyond its need, or the recording of unfair judgements and opinions. Families wanted to view the data stored about them. They wanted a set of basic rights - to be informed, involved and accurately represented, with the ability to see, explain and correct their data to ensure it is fair and accurate. They wanted to know that their data will be handled sensitively and only by those that need to know, and they believe that having these capabilities would help them to be able to work together with representatives of the state in a more positive relationship.
Beyond the need for families to be given such rights, we were able to draw further implications from these findings: that family civic data is currently used as a proxy for families in decision making, which cuts them out of the loop, and that families should be given the opportunity to have a relationship with their data, as well as the opportunity to co-operate and have agency in its stewardship. Further findings and insights are published in (Bowyer et al., 2018).
Through my embedded involvement with the SILVER project (see 3.4.1.1) I was able not only to complement my understanding of the family perspective on civic data use in Early Help, but also to acquire an understanding of, and sensitisation to, the staff/local authority perspective on that same data use. SILVER conducted qualitative interviews with supported families, and the findings from these reinforced the need for greater inclusion of families in data handling, having identified that while families were willing to consent to their information being shared in order to improve their care, they had very little understanding of how it was used and could not be deemed to have given informed consent to the way their data is currently used.
SILVER conducted a series of “Amy’s Page” (Wilson, Wilson and Martin, 2020) focus groups/workshops with support workers and other local authority representatives, through which I learned that staff had a desire for greater access to health information, particularly mental health indicators. These staff revealed a desire to gather as much data as possible about the families they were working with, viewing data as a useful raw material that enabled them to do their job better.
Collectively, the findings from my own research and from SILVER showed a conflict between the desires of families and support workers – with families wanting more involvement and less reduction to data, but support workers wanting to amass more and better data. In part due to its solutionist (Morozov, 2013) framing, the SILVER project prioritised the support worker perspective in its requirements and continued to pursue the building of a richer data interface for support workers. This was the point at which my research objectives and those of the SILVER project diverged, as I was not ready to ‘take sides’ nor to pursue a purely data-centric solution; I wanted to explore whether it might be possible to satisfy the needs of both parties and to maintain focus on human-centricity and the need for a balanced relationship.
In searching for an approach to civic data use in Early Help that could meet both parties’ needs while also addressing our research focus of increased data interaction, I began to explore the idea of shared data interaction; instead of the support worker being the gatekeeper controlling and limiting the family’s access to data, and accessing data ‘behind the scenes’ at their offices, what if data could be looked at, examined, and updated together, during the face-to-face encounters between families and their support workers? This could potentially bring all the benefits of human-data interaction (increased agency, negotiability and legibility) (Mortier et al., 2014) to families (and also to workers), while also serving as a boundary object that might improve the relationship itself (Bowker et al., 2015). In theory, it would allow families to gain some access to currently inaccessible data while also making it easier for support workers to ‘fill in the gaps’ in the data they already have by simply asking questions.
This concept emerged in part from the participants in the first phase (see below) of the research engagement, and became a main focus for the second phase, so that we would not only be exploring RQ1 and especially RQ2 in the context of current practice, but also be asking participants to imagine a different set of practices that might potentially serve their needs better. In doing so, we would be able to assess whether the imagined model of shared data interaction might address both groups’ needs and whether or not it would be perceived to benefit the early help support relationship as a whole. Regardless of whether this particular model was a preferred solution, such an exploration would be helpful as it would put participants in a speculative, co-design mindset that would elicit deeper insights about how civic data should be used, rather than just opinions on how it was currently used.
| Workshop | Engagement | Phase | Number of Participants | Activities |
|---|---|---|---|---|
| Workshop A | Design Workshop for Families | 1 | 8 adults and 9 children from 5 supported families | - Data Card Sorting - Sentence Ranking - Ideation Grids - Poster Design - Scenario Discussion |
| Workshop B (2 instances) | Design Workshop for Staff | 1 | 36 support workers & related staff (in total) | - Data Card Sorting - Sentence Ranking - Ideation Grids - Poster Design - Scenario Discussion - Interface Discussion |
| Workshop C | Combined Staff and Parents’ Design Workshop | 2 | 3 support workers and 4 parents from supported families | - Sentence Ranking - Storyboarding Practice - Scenario-based Storyboarding |
During the summer of 2018, we conducted four two-hour co-design workshops across two phases, as detailed in Table 4. In phase 1, the initial objective was to reconfirm the findings of earlier work and gain a deeper understanding of both parties’ (families’ and staff’s) perspectives on data within the support relationship, by working with each group separately. A further objective was to learn about existing data practices and whether they worked or needed improving (and where they did, to identify what the issues were). In phase 2, the objective was to work collectively with representatives from both groups to design imagined data practices and interactions for the shared data interaction model, and to understand how in practice staff and families would imagine themselves using data together in the support relationship. Across both phases, a variety of participatory methods were used to explore these topics, as described in sections 3.5.2 and 3.5.3. All workshops were audio recorded and transcribed. These transcripts were then analysed thematically, and in some cases quantitatively, as described in section 3.5.5. Refer to section 4.3 below for the major themes discovered.
Prior to the main exploratory activities, it was important to ensure that all participants arrived at a common understanding which they would use to approach their ‘design brief’. There was also a need to validate whether prior findings about the perspectives of staff and families held true for these participants too. To address both of these goals, a sensitisation (see section 3.5.1) and data-gathering activity called ‘Sentence Ranking’ was conducted, in which participants were asked to consider a number of ‘opinion statements’ and rank them according to (a) whether they agreed, disagreed or were neutral on that statement and (b) whether or not they felt that statement was important. These statements, such as ‘Families should always be able to talk to someone about their data’ (more examples are given in Figure 21 below and the complete list of sentences is included in [INSERT REF TO APPENDIX SECTION HERE]), were collated from family and staff perspectives observed during the above preliminary study, from the SILVER project’s own research findings, and from my own observations through interacting with local authorities as part of my embedded role within the SILVER project. In discussing and reaching consensus on these opinions, families and staff would in effect be ‘agreeing requirements’ that could inform their thinking during design activities. Conducting this same activity across all participant groups and across both phases would also allow comparison between the different groups to identify differences and find shared values.
Within each workshop, groups of participants sat at tables of 4 to 6 people, and each table provided its own sentence rankings. This produced numerical ranking data which was analysed as follows:
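To give a concrete (purely illustrative) sense of how per-table rankings of this kind can be aggregated, the sketch below scores agreement as +1/0/−1 and importance as 1/0 per table, then averages across tables and flags sentences whose tables ranked in opposite directions as contested. The sentence IDs, scoring scheme and contention rule here are hypothetical illustrations, not the exact procedure used in this study:

```python
# Illustrative aggregation of per-table sentence rankings.
# Scoring scheme (hypothetical): agree=+1, neutral=0, disagree=-1;
# importance: important=1, unimportant=0.
from collections import defaultdict
from statistics import mean

AGREEMENT = {"agree": 1, "neutral": 0, "disagree": -1}

# One dict per table: sentence ID -> (agreement, importance).
# Sentence IDs here are placeholders, not the study's real data.
tables = [
    {"S3": ("agree", True), "S17": ("disagree", True)},
    {"S3": ("agree", True), "S17": ("agree", False)},
]

scores = defaultdict(list)       # sentence -> per-table agreement scores
importance = defaultdict(list)   # sentence -> per-table importance flags
for table in tables:
    for sentence, (agreement, important) in table.items():
        scores[sentence].append(AGREEMENT[agreement])
        importance[sentence].append(1 if important else 0)

for sentence in sorted(scores):
    # A spread of 2 means at least one table agreed and one disagreed,
    # i.e. the sentence was contested between tables.
    contested = max(scores[sentence]) - min(scores[sentence]) == 2
    print(f"{sentence}: agreement={mean(scores[sentence]):+.2f}, "
          f"importance={mean(importance[sentence]):.2f}, contested={contested}")
```

In this toy input, S3 would emerge as universally agreed and S17 as contested, mirroring the kind of consensus/contention pattern reported below.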
The data table for this analysis is shown in [INSERT REFERENCE TO APPENDIX]. The visualisation of these findings on shared values is shown in Figure 21. As the figure shows, there was universal agreement that:
Participants felt it important to address the inadequacy of current consent practices. There was also strong agreement that families did not want to be responsible for looking after their own data, though this was felt to be an unimportant matter.
There was considerable contention over whether or not support workers should be able to access historical family records (discussed further in 4.3.3.1), over how families would feel about the collection of data about them, and over having responsibility for managing access to it. Most other sentences received moderate agreement.
[TODO: update the diagram to indicate (e.g. via a family symbol and a “support worker symbol” together with either a + or - sign and coloring in green/red) disagreement / agreement by the different stakeholder parties] [TODO: update the diagram so it doesn’t look like rows 2 and 3 are in the wrong order] [TODO: use a different word than ‘agreement’ within coloured boxes to avoid confusion; explain ‘neutral’]
Having completed the sentence ranking sensitisation activity, participants went on to carry out the other co-design activities detailed in Table 4. Findings from the analysis of these activities’ transcripts are presented in the next section.
The transcribed corpus from audio recordings of workshops A, B and C (approximately 120,000 words) was divided by activity, group, and family or staff focus into 85 different source texts. Each text was thematically coded, and the coded texts were analysed through four analytic cycles using the Miles and Huberman approach (Huberman and Miles, 2002). During this reductive process, participant creations, activity outputs and ranking data were referenced to add context to the interpretation. In this section, the qualitative findings from the thematic analysis of transcripts of workshops A, B and C are presented. In 4.3.1 the three main themes and subthemes are introduced; each theme is then further detailed in sections 4.3.2 to 4.3.4, including participant quotes.
Given that our conversations with participants were framed as explorations of data use within the early help relationship, our findings are expressed as desirable best practices, some of which involve the proposed model of shared data interaction, within three core areas that participants see as beneficial to the early help relationship and ultimately to the family being supported: Meaningful Data Interaction (Theme 1), Giving a Voice to the Family (Theme 2), and Earning Trust through Transparency (Theme 3). From explicit and implicit statements from participants, contextual clues, and accumulated knowledge from being embedded in this context, we were able to judge whether the discussed best practices were commonly in use (“current”), happening occasionally or partially (“emergent”) or not yet occurring at all (“imagined”). Tables 5, 6 and 7 show the subthemes within these themes, along with illustrative participant quotes, and indicate the current, emergent or imagined status of each subtheme. Structuring the themes in this way enables these findings to function as constructive, actionable input for designers of Early Help (or other social care) systems and processes.
| Subtheme | Description & Quote | Status |
|---|---|---|
| Understandable Information Summaries | To maximise understanding, simple summaries of the information within families’ data should be available to both families and support workers. Visualisations should be used to ease comprehension, and information should be contextualised at different levels (individual, family, community). “There’s so much data that’s stored. For me, for a parent [I want] to understand that through a text or email but just in point form. […] The less written, the better for the parent. [What we need is] a small synopsis […] like a summary view.” [Parent, SQ44] “Some families will go, ‘Well you know that information because it’s all there somewhere.’ We’re like, ‘Yes, but we don’t want to trawl back to eight years ago.’ There’s reams and reams and reams of it [data].” [Worker, SQ40] | Emergent |
| Interact With Data Together | Support workers should work to actively counter the knowledge imbalance by informing families what their data says. They should make use of specific datapoints as talking points to aid planning conversations. “You could have a table, you’d look at where they are and where they could be. [You could say] ‘This is where you are now but if you [take these specific steps], even though you’ve got a criminal record, you could progress to this level.’” [Worker, SQ29] | Emergent / Imagined |
| Direct and Unified Data Access | Individuals should be able to directly access their civic data through a personal interface; this should be a single, common place where all of an individual or family’s data is brought together to give a complete and consistent overview to all parties with a need to know. “[I’m imagining an] online database of personal family info accessible [only] by people, practitioners that have permission […] I would say that it’s only who you want [to give access to, that can see it]. You would have your private code which you could hand out, like the doctors give you appointments.” [Parent, FQ8] | Imagined |
| Ongoing Data Access and Support | It is not sufficient simply to give access to data. Families should be able to access information in their own time and should be supported in understanding it. Most importantly, they should be able to ask questions, challenge data records or start a conversation to discuss their data at will. “[The families would have] a little app which they can log into and read all their information - what’s recorded about themselves, […] who we share the information with […]. If they’re not happy […] they can fire off an email to us and let us know what they disagree with or if they want their information taken down or their consent.” [Worker, SQ51] | Imagined |
| Subtheme | Description & Quote | Status |
|---|---|---|
| People not Records | Support workers must always treat people as individuals who are more than a data record. They should review family data before contact, but must always engage at a human level too, avoiding making any judgements based solely on data. Worker A: “You should never make a judgement on data… that data could be wrong.” Worker B: “It takes individuality, working with that person as well, doesn’t it?” [SQ11] | Current / Emergent |
| Checking Data Together | Families should be explicitly invited to review, discuss, check, correct and approve data records. Data recording should be visible, and workers and families should check data together. “[The parent could] countersign. [The worker would] say, ‘I feel that we’ve talked about this today so I’m going to write that down. I’m going to show you. Can you sign and me sign if you’re happy and I’m going to share this.’ That’s a bit different [better].” [Parent, FQ12] | Emergent / Imagined |
| Changing Lives Means Changing Data and Changing Consent | Recognising that families’ lives are in constant flux, routine reviews of data should occur, and families should be invited to regularly review their choices regarding data collection, keeping and sharing. All systems and processes should treat data as fluid and flexible, not as static unchanging facts. Feeds of recent changes should be available to both parties. “[There’s] this perception of something sticking with you even after you’ve potentially reformed. […] That’s something that happened a long time ago and that judgement is still there but [you’d be wondering] ‘Okay, is it [true]?’” [Worker, SQ61] | Imagined |
| Individual Agency & Family-sourced Data | Individuals should be able to create or contribute their own data to tell their own story and annotate particular datapoints with their own explanations. Worker A: “If you read information […] about me, you wouldn’t expect to meet the person you meet.” Worker B: “That’s it. It’s the same for everybody.” Worker A: “[…] It just [has] basic things in most of the time, doesn’t it […]. You’re not a person [in the data record] are you really?” Worker B: “[I’d like it if you could] give your bit of personal data, your own story.” Worker A: “Yes, because everybody makes mistakes and there’s probably thousands of people out there who have got a criminal record and have never done anything since. [They’re] getting judged by having one thing [but they should be able to write] ‘Yes, I did this because of this situation but this is what I’ve done to make myself [better]…’” [FQ10] | Imagined |
| Granular Access Controls | Families should be given controls to manage access to their data and to configure and change preferences at a fine-grained level. “[Families need to] feel they’re being involved. […] [We need to be able to] sit together and say, ‘Right, that’s the information I’ll allow you to share. I don’t want that bit shared. But this bit, because it will help me and the family […]’. Say in this [scenario] family, she might have been married before and had domestic violence so she doesn’t want that bit shared, that’s in the past. So it’s [only] certain up-to-date information about the family [that would be shared] because this [the family suggested by the data] isn’t her family.” [Parent, FQ16] | Imagined |
| Subtheme | Description & Quote | Status |
|---|---|---|
| Transparent, Respectful Data Handling | Support workers should treat families’ data with the utmost respect, keeping it safe and ensuring it is not used beyond its intended purposes, shared without consent or put at risk. When talking to families about data, it is helpful to focus on positives and strengths and not to use it as a means to criticise. “There was a time where I was at the doctors’ and they asked how many units of alcohol I drank, and I said, probably about three bottles a week, at the time, not any more but later on [the support worker] pulled me up on it and they had it down as three bottles a day. That could have caused an issue was anyone ever to ask.” [Parent, CQ7] | Current |
| Always Seek and Demonstrate Greater Understanding | Support workers should always assume that their understanding from data is incomplete and should seek to learn about individuals and build a more complete picture of their lives. By showing this effort and their growing understanding, they will engender trust. “You don’t want to reduce them to this number in a database. You want to understand their actual experiences and support them in getting better.” [Worker, SQ74] | Emergent |
| Pro-actively Challenge Data-centric Norms | Support workers and agencies can recognise that current systems and processes are data-centric and imbalanced, and can strive to change this through their actions: being as open as possible about how families’ data will be handled, ensuring that proper oversight mechanisms exist for data handling, especially in the case of contentious issues, and ensuring that data is shared openly but consensually between authorities. “It hasn’t been explained properly to this [scenario] family that their information will be shared with other professionals. So, they’ve been left feeling really let down and probably quite angry about it. So, although that information does need to be shared, they [the support workers involved] ought to make the family properly aware that information will be shared.” [Worker, SQ18] | Imagined |
Through our discussions with families and support workers we gained a deep understanding of what sort of data interactions were considered ideal for a family. Setting aside interface considerations, which were not the main focus of our enquiry, and focusing on the wider sociotechnical context around the data and its access, the key requirement we uncovered was that in order to maximise understanding for all parties, data interaction needs to be meaningful – this is the first theme of these findings. Encompassed within this concept are the need for understandable and effective summaries and visualisations, the need for direct and ongoing data access with human support, and the recommendation for families and support workers to interact with data together within the support interaction.
Written summaries of information were independently considered critical by both parents [SQ44] and support workers [SQ40]. These could also be used as a mechanism to protect privacy, by keeping sensitive details hidden:
“In that example, depression, ten year ago, that shouldn’t be on there for the support worker. All they should get is if Social Services have been involved and it should just be, ‘Please contact for more information.’ […] [The system should stop workers from] getting a list of all the kids who have ever missed dental appointments or when you were depressed ten years ago. […] There needs to be a thing where it’s, sort of […] key trigger words, where if the word comes up a lot of times, it spots the patterns. Whereas, if [a problem] is mentioned once, it should only be [shown] at the highest level.” [Parent, CQ10]
Because large volumes of historical data are known to be amassed, families expect (though are not happy about it [FQ6]) that any aspect of their past life may be ‘findable’: “We go to them and say, ‘We’re aware that you’ve got these issues going on’ […] and not one family I’ve ever met has said, ‘How on earth have you got that information?’” [Worker, SQ42]. Managing expectations can be problematic [SQ40], and some workers felt they should not be given greater data access, fearing it would create a greater liability to ‘trawl through data’ so that they know everything.
This need for summaries can also be seen as an echo of Gurstein’s call for ‘effective data use for everyone’ (Gurstein, 2011). It is not sufficient simply to open up public sector databases to allow individual record access; families need not just the opportunity but also the technology, skills, formatting, interpretation and sensemaking to make that access effective. Some individuals may lack “proper access to a computer.” [Parent, CQ9]. Data tables are insufficient and may need to be supported by visualisations: “Some families might not understand [a data viewing interface]. They might not be technical… I think sometimes it’s easier to do it in pictures.” [Worker, SQ43]. Participants suggested pie charts, graphs, spider diagrams and timelines [SQ30, SQ31] or even an audio interface for the visually impaired [SQ45] to aid understanding. Visualisations also need verbal explanations [CQ11].
We noted that it is not clear who could or should do the skilled knowledge work of creating these representative and accurate tailored summaries and visualisations.
Directly using data together within a support conversation is seen as a key element of making data interaction meaningful for families. For support workers, the use of data can form ‘a way in’ or conversation starter:
“[Showing the data could be] an ice breaker [with] a new case. So, ‘We’ve got this information; can you tell me more about it?’ That opens it up, like a can of worms and it all just comes out; you know what I mean? Then you’re able to have that open and honest conversation with them to see what level of support that they need.” [Worker, SQ28]
The showing of data serves an additional important purpose: combatting the lack of awareness of what data exists and who holds it [SQ39]. Currently, much of the data stored about families is invisible to them: “Families really only see the data that we [support workers] want to present.” [Worker, SQ37] Regardless of families’ legal right to request copies of their data, our understanding is that this right is rarely used [SQ38], and typically only when filing complaints. Lack of awareness can not only cause suspicion [SQ17], but also incorrect assumptions that support workers ‘already know everything’.
Participants particularly recognise the value of referencing data points over time (such as a record of welfare scores that support workers have previously given them), for example to track progress [SQ29, shown above in Table 5]. This could motivate and reinforce progress [SQ6] by relating behaviours to consequences [SQ32] – essentially facilitating data-based decision making. Reviewing historical data is preferable to verbal description: “Whenever you go through stuff like that [verbally], especially historic stuff, they can be quite remote so [having the data in front of you] would be good for that.” [Worker, SQ33].
Despite the reality that families currently have no direct access to their civic data, family participants all eagerly described designs including apps, intranet terminals, online chat facilities, and self-service webpages, all offering individuals the ability to view their own data; there is a clear demand for personal data interfaces, which could empower families to use their own data: “they could quickly tap onto the app […] and show somebody else where they’re at.” [SQ54]
“Our first [idea] is the lovely [child’s name] has made an app. [It’s] free to download, you can make your own password and there’s going to be a button on it so you can press it and then query the information that’s held on you straight away.” [Parent, FQ7]
Workers and families shared a desire for one single point of access for data, usable by all parties [SQ25, SQ26], though families ‘don’t want to be responsible for looking after all our data’ [FQ17, S5]. Bringing together data from multiple sources would allow patterns to be spotted through correlation, which workers perceived would help their preparation: “[This imagined interface] would provide individual histories but you could also pull them all together so you can prepare, so for instance if mum was having some significant issues with mental health, you might be able to correlate the [child’s] school attendance alongside that and find out why that’s happening.” [SQ8]
Families, being accustomed to accessing information in other parts of their lives through smartphones and web interfaces, expect that any civic data interface would allow them to access data “in their own time, at their own pace” [Parent, CQ12]. Currently, access is only possible via the support worker, who functions as a gatekeeper within the support interaction, so opportunities to reflect upon the data are limited in time and coverage:
“[If conflict occurs,] I would need to go away and seek some advice on what can happen next, but it could be useful for the family, to spend that period of time, perhaps looking at all the information and identifying what it is that they feel they’re being judged on.” [Worker, CQ13]
Timely access to data could be empowering: families could track their own progress, enabling them to make plans outside of the support relationship and reducing dependency upon support, in line with the ultimate goals of the programme:
“If we were working with a family about school attendance, could we then link that in to [the families’] app so parents [would be] aware of what their attendance looks like at this point in time and they […][could] monitor it themselves and take accountability.” [Worker, SQ49]
As well as having ongoing access to data, families need human support to understand that data [SQ49, CQ11]. All participants agreed that ‘Families should be able to talk to someone about their data’ [S7]. Explanations are needed [CQ11] with language and vocabulary adjusted to individual literacy [SQ46] or age [SQ47]:
“No matter which [presentation of data is offered], you’d have verbal context for it as well, wouldn’t you? You wouldn’t just go, ‘There’s your app’ or ‘There’s your piece of paper’ and leave them. You’d just talk it through with them anyway.” [Worker, SQ49]
Key to meaningful involvement is the ability to start a conversation. Groups imagined families being able to send a message [SQ51] or record audio to raise an issue for discussion, letting their disagreement be known and empowering them to be part of a dialogue about what is recorded [SQ60].
The second theme of these findings is that processes and systems, which currently rely excessively upon the ‘facts’ within the data record, need to be updated to give the family an empowered role within their civic information ecosystem. The purpose of an early help intervention is to obtain more information for a better understanding of the family’s situation and to make evidence-based plans and decisions to improve the situation, so seeking objective truth is clearly central; impressions of that truth can be formed either by reading the data or by talking to the family. We uncovered benefits and dangers of relying solely on either source. Families should become agents in the data ecosystem, and this involvement should lead to both greater empowerment and better evidence-based decisions.
We found evidence, consistent with the literature (Gitelman, 2013) and my earlier study (Bowyer et al., 2018), that data can never represent absolute truth - it is often biased or incomplete, and this can mislead [SQ59 (shown in Table 6 above), FQ11A]. For example, a lack of mental health information could make an individual look like a poor parent [SQ12]. Families may be less willing to ‘open up’ if they feel they may be judged unfairly [SQ14]. Therefore, developing a strong relationship between the worker and all family members is key to understanding the full picture [FQ1]; to ensure fairness [SQ77], data must be current and complete [SQ13], but this state can only be achieved with the family’s cooperation. Looking at data will never provide support workers with a complete understanding. Yet workers often ‘tend to just trust that everything that has been put down is right’ [CQ1], allowing the data perspective to dominate. Such assumptions should be avoided [SQ10]; processes must recognise maintaining human face-to-face dialogue as a priority. Data should only provide supplementary insight: “You should never make a judgement on data… that data could be wrong. It takes individuality, working with that person as well, doesn’t it?” [SQ11]. All participants presented with the sentence “Public sector officials can make good decisions just by looking at a family’s data” [S18] disagreed with it.
In spite of the warnings above, the data record is undeniably useful; over 80 comments from workers contended that the current practice of reviewing a family’s data prior to meeting in person is beneficial, because it provides useful background that helps them identify support needs. For example: “I had a family where trying to unpick what had happened, over ten years, to the child, was really difficult. So, I went away, got the information and came back and if you have […] that picture of how the family works [when you meet them], [that helps].” [SQ1] Additional benefits identified included safeguarding workers [SQ3] and giving them the ability to ‘check the family’s claims’ so that they might constructively challenge individuals [SQ4]. Supported families echoed the value of workers reviewing data [FQ1A], and saw benefits including ‘not having to repeat your story’ [SQ5].
The compromise that participants identified over the use of data is that workers should avoid making judgements based solely on data. While sometimes providing essential background to a worker [FQ11B, SQ62], historical data in particular often leads to inadvertent prejudice, especially where labels are used [SQ9]. No participant disagreed with the sentence “Labels like ‘domestic abuse’ are damaging to families and hard to shake off” [S15], and workers recounted experiences of being uncertain how to judge historical issues: “[There’s] this perception of something sticking with you even after you’ve potentially reformed. […] That’s something that happened a long time ago and that judgement is still there but [you’d be wondering] ‘Okay, is it [still true]?’” [Worker, SQ61]
Many participants concluded that only ‘relevant’ information should be available, to those who ‘need to know’, but the wide range of opinions we saw expressed suggests that this is a highly subjective judgement that would be difficult to determine. A cut-off period before which workers should have no right to look was suggested [Parent, CQ15], but the sentence ‘Officials should be able to see historical records about families’ [S17] was contentious. Some workers feared any restriction in access might mean they miss important background on an individual’s past, such as sexual abuse or mental health issues [Worker, SQ76]. The solution to this dilemma is unclear, but transparency about what is in the data would seem to be a critical ingredient (see 4.3.4).
The idea of families and support workers reviewing data together arose from many of our participants in workshops A and B, and this led us to explore this concept of ‘shared data interaction’ in more depth through the storyboarding exercise in workshop C (see 4.2.4 above). Families perceived value in having not just data representations (as in 4.3.2.2 above) but a data interface present within their care meeting, so that they could see actual data and have it explained to them. One practice embodying the concept of transparency that is emerging in some care services is the use of ‘2-in-1’ devices (laptop/tablet hybrids) within the care engagement, so that workers can visibly record data in front of families and then ask them to ‘approve’ its accuracy on screen [FQ12, SQ67]. Participants believed this would help to build trust between support workers and families; if a family begins to feel powerless, they may disengage [SQ35], but even minor involvement, such as this emergent practice of signing off approval of data records [FQ12] or an imagined process of checking and correcting data records together (see next section), could make families feel more empowered, which could make the support relationship more productive.
Family participants imagined going beyond just seeing and getting verbal explanations of their data to being able to review their data and be asked for their approval of its accuracy [FQ3]. Maintaining accurate data is important because that data is used to decide care plans and support strategies. Families are thought to be better placed than anyone else to identify inaccuracies or gaps in their civic data, and participants believe family corrections would increase data accuracy. This does not mean free editing of records (as discovered in the earlier study (Bowyer et al., 2018), fears and/or self-interest could lead to families misrepresenting themselves in data) but rather taking a role in reviewing, annotating, explaining or requesting changes, through direct data-centred collaboration involving workers and family members:
“[There would be an] individual view where each person within the family would have their own section […] you could sit with them […] and go through the data that we have got which would enable them to change anything that they want taken out.” [Worker, SQ66]
Shared data interaction carries the potential to bring benefits in accountability, accuracy, simplicity [SQ25, SQ26] and consent.
One reason for reviewing historical data, and for requiring dialogue with the family to gain an up-to-date picture, is that the truth changes over time. People are not static, and families’ lives are always changing; given marriages, divorce, birth, death, house moves and other changes, data can become out-of-date simply through inaction. Given this, asking for consent once at the start was considered insufficient [S3]. Data is inherently static – it does not change, but people do [SQ61, SQ63]. This was the basis of participants’ desired practice that both the content of the data and the family’s consent over what happens to that data be reviewed regularly [CQ16]. A process of regular reviews around data use could prevent unwelcome surprises about how family data is handled [CQ2, CQ17], which could damage trust and hinder co-operation. Participants imagined data systems issuing notifications or update feeds for families and support workers showing significant events or data updates [SQ64]. Support workers currently get notified of police incidents, safeguarding concerns and hospital admissions, but alerts of data changes across the care ecosystem could provide useful triggers for reviews or discussions:
Worker A: “We would get a report through to say…” Worker B: “They’ve recorded something.” Worker A: “Yes. Then I suppose we would follow it up […] face to face.” [SQ65]
Regardless of the particular mechanism, it was ultimately felt that both data systems and support processes need to do a better job of supporting change.
The idea of families reviewing data has significance not just for how it can help within the support interaction, but because it can give families an independent role in their data ecosystem. Both families and support workers imagined the family being able to interact with their civic data on their own, something that is currently not possible. This is a vital step for empowerment: if something goes wrong, families must be able to discover this and must be able to do something about it. Without a cycle of feedback involving individuals as stakeholders having the ability to review and correct data, data will quickly become inaccurate (Pollock, 2011). Thinking about data interaction at home unlocked additional thinking, such as families helping to fill gaps in data [SQ57] or contribute new data that may not otherwise be recorded [SQ58]. Giving families the ability to contribute new data would empower them to ‘tell their own story’ [FQ10]. Many participants saw this as-yet-unavailable capability as expected common sense:
“I just generally want to see [what is stored about me] just to know what people are saying and then obviously if it’s wrong, I can correct them on it.” [Parent, CQ14]
Rather than solely relying on dialogue, families could provide new data more directly, e.g. through a ‘family network app’, which could also increase their sense of data ownership:
“It would [ask them] who [professionals the family is involved with] they could name outside of their family to create a network. […] But it would collect more than that, […] it would allow the family to be accountable for their data collection and making sure that it’s accurate […] because we often go away and record it all on [our existing database] and it’s our story rather than their story of how the events occurred.” [Worker, SQ36]
With new ways for self-expression, families could add context for support workers [FQ9, SQ55], unlocking new support topics [SQ56]. The overriding sense from both groups was that families having the ability to annotate or explain their data would allow them to hold authorities to account, and empower them to tell their story and ‘show the real me’, as illustrated in [FQ10, shown in Table 7 above].
Participants identified that it is important to consider that different individuals within the family would have different roles, access and summaries, in order to respect individual privacy [SQ52, SQ48]. Psychological harm could be caused through information leakage, for example an adopted child finding out their true parentage (Bowyer et al., 2018). To avoid this, data should be managed carefully, with consent being less binary and more fine-grained access controls being offered:
Worker A: “When a child turns 16, when they go to the doctors, is that confidential between me and my GP or can my parents see that?”
Worker B: “I think it’s confidential.”
Worker A: “Exactly. So in this interface, I [would be] able to see that – [as the] 16 year old - you as my support worker could also, but not my mother.” [SQ53]
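The kind of fine-grained, role-aware access control the workers describe can be illustrated with a minimal sketch. This is purely illustrative: the record categories, roles and rules are hypothetical, not a specification of any system discussed in this study.

```python
# Illustrative sketch only: hypothetical roles, record categories and
# rules, not a design for any real Early Help system.
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    subject: str    # whose data this is, e.g. "child"
    category: str   # e.g. "medical", "education"

# Per-category access rules: (viewer role, record category) -> allowed.
# The 16-year-old's medical record is visible to them and to their
# support worker, but not to a parent, matching the dialogue above.
RULES = {
    ("self", "medical"): True,
    ("support_worker", "medical"): True,
    ("parent", "medical"): False,
}

def can_view(viewer_role: str, record: Record) -> bool:
    # Default-deny: any combination not explicitly granted is refused.
    return RULES.get((viewer_role, record.category), False)

record = Record(subject="child", category="medical")
assert can_view("self", record)
assert can_view("support_worker", record)
assert not can_view("parent", record)
```

The point of the sketch is the granularity: consent is expressed per viewer role and per category of data, rather than as a single all-or-nothing permission granted at the start of an engagement.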
Once such capabilities are established, they could enable much more careful and deliberate forms of data-sharing, supporting the creation of a personal data ecosystem (see section 2.3.4) extending beyond, but centred upon, the individual family member, all the while remaining under that individual family member’s control:
“[I’m imagining an] online database of personal family info accessible [only] by people, practitioners that have permission […] I would say that it’s only who you want [to give access to, that can see it]. You would have your private code which you could hand out, like the doctors give you appointments.” [Parent, FQ8]
Looking at Theme 2 as a whole, we can see that giving families a role in the creation and stewardship of their data selves has great potential to unlock new capabilities and a sense of empowerment for families.
The third theme looks at these imagined new data access capabilities, and this empowered role for data subjects, in the wider sociotechnical context of how they could affect the support relationship. The topic of trust arose directly or indirectly in almost all participant conversations, and our findings show that transparent and open data handling and decision-making processes are key if support workers are to earn the trust of supported families. Currently, families are mostly unaware of what data is held about them and what discussions about them are being had, and have no choice but to trust both the support workers and all the parties and technologies involved in the surrounding care ecosystem, which is very hard to do when they have little to no visibility of it. Without visibility, any error or surprise can be very damaging to this fragile trust and can harm the relationship; conversely, increased transparency and explanation can avoid surprises and build trust, improving the relationship.
The findings in Themes 1 and 2 above clearly suggest that in seeking the best possible understanding, families must be engaged in a fact-centric way, which requires trust in the support worker (to interpret and record data fairly and accurately) and in the system (to keep data safe and prevent misuse). A good relationship with the support worker is critical [FQ1] to the family’s care. Workers recognise the importance of being transparent with families:
“I think that [families] have got a right to know what is held about them and what is said about them.” [Worker, SQ50]
Even for data that would itself be considered uncontroversial, a lack of awareness of that data, or a lack of transparency in how data is informing judgements, can cause great worry to families:
“Some people that I’ve worked with, I think as soon as they know you’re holding information about them they get really tight and [say], ‘What are you holding about me?’ […] They don’t like people knowing what’s going on in their lives.” [Worker, SQ70]
The current approach, which relies on the support workers mentioning data that they consider relevant, can reassure families when they are kept thoroughly and regularly informed, but can result in expectations being broken by accidental sharing of information if its sensitivity is overlooked:
“That tends to be the biggest problem with this, these little bits of information that nobody ever thinks are relevant to bring up in everyday conversation and they’re coming out.” [Parent, CQ3]
Data must be handled respectfully, with attention to family and individual privacy. A lack of transparency and trust can lead to an atmosphere of suspicion [SQ17] where families have ‘a totally overwhelming feeling of people checking up on them’ [SQ71] and apply extreme scrutiny to what they are told: “You can get families who [no longer] believe what’s being said about them.” [Worker, SQ73]. Fearful of consequences [SQ72], families may withhold information:
“Well my thing would be who is [my data] going to be shared with? Which authorities? What is going to be shared? […] If I ask for help because my son has got massive behavioural issues and I’ve been trying for years to get help with him and […] if I go to social services, are they going to come in and think I can’t cope because I’m on my own with five kids? Are they going to take all the kids away? That’s my thing. So I’m terrified of Social Services, I really am.” [Parent, FQ14]
Respectful data handling also includes using tact and discretion when referencing data, and a common current practice is the use of a strength-based approach [multiple workers in workshop B] when presenting or referencing data that could be perceived as particularly negative or judgemental; looking for the opportunities for growth rather than seeking to criticise.
An open and respectful approach is rooted not just in decency but in practicality, as a co-operative family is easier to support: “Because if someone is feeling judged or stressed or angry or whatever, then they can stop the conversation” [Parent, CQ5]. Being transparent with data can also help with accountability and accuracy, helping to detect and prevent mistakes earlier:
“There was a time where I was at the doctors’ and they asked how many units of alcohol I drank, and I said, probably about three bottles a week, at the time, not any more but later on [the support worker] pulled me up on it and they had it down [in the data record] as three bottles a day. That could have caused an issue was anyone ever to ask.” [Parent, CQ7]
In current practice, data handling is generally respectful - data mishandling and unexpected uses of data are currently mostly avoided; but transparency is low, making the perception of respectful handling quite fragile and entirely based upon trust rather than direct experience.
In order to earn, build and maintain trust, support workers must always be seeking to form a more complete and up-to-date picture of the family, in line with the finding in 4.3.3.1 that individuals are more than what is stored in their records, and that this requires human interaction to uncover. Demonstrating a deep understanding of the family, and that a family’s lived reality has greater priority than what a database says, can be critical to trust-building: “You don’t want to reduce them to this number in a database. You want to understand their actual experiences and support them in getting better.” [Worker, SQ74]. It is important that families understand workers’ good intentions when accessing data about them [FQ15]. However, if workers had to show all available data to families, this could make it challenging to maintain good relations, “because literally [the data we have] is like everything, isn’t it? So I don’t know how I would feel…” [Worker, SQ21]. In addition to avoiding breaches of expectations (see Theme 2 above), a transparent approach ensures that the privacy of families is respected, because data is not used in decisions without the chance for explanation:
Parent: “I don’t want everybody knowing how rubbish I am with money.”
Child: “That’s my life.” [FQ2]
Participants also indicated that families’ desire for transparency (as mentioned in the previous section) does not just imply reporting data usage; families need dialogue and human engagement to give them reassurance, and trust and reassurance are best achieved through a conversation [FQ1], not a data interface. Support processes need to change to better recognise the role of dialogue, rather than just consultation of a database, as the best way to achieve a rich and nuanced understanding.
Exploring this need for reassuring dialogue in more depth, we see that to avoid damaging negative spirals of emotion, deliberate openness is needed from support workers (and the entire care system) [SQ18] as to what information is held, and how it will be used and shared, in order to alleviate fears of data being used ‘against’ families that can arise without that transparency – giving them instead confidence that their interests are being protected, thus putting them at ease [SQ20]. To our understanding, explanation of data handling processes happens only once, in very loose terms, during initial engagement with a family for the purposes of collecting informed consent, and is rarely revisited. We found that workers could easily imagine explaining data practices in greater detail than they currently do [SQ41], and clearly there is a need for proactive action by workers to counter the inherent knowledge imbalance of data being collected into systems that they are gatekeepers for.
Workers, however, lack control over the quality, coverage and timeliness of family data, and see this as a systemic issue they cannot adequately address. From my experience with early help teams through the SILVER project (see 3.4.1.1), it became clear that while support workers can see more data than most, they have far from the complete picture; in fact, there is no one organisation or individual with visibility of the entire family-information ecosystem, suggesting that greater openness with data would benefit not just the family, but other civic actors involved in the family’s life and care. Some participants suggested that openness about data handling needs to accompany data access; for example, if browsing information together (as described in 4.3.2.2 and 4.3.3.2), it would be important to explain where the information has come from and why the support worker has it, rather than just reporting its content:
Parent: “[if the worker knew sensitive medical information] the family would be really annoyed, they would just want you [the worker] to go.”
Worker: "I’m the same, me. I’d be like ‘I don’t know how you got all this?’. That would be my first reaction but then if we [were to] discuss it and browse the information with the family [that would work better]." [CQ6]
As mentioned in 4.3.3.3, there is a need to replace the current practice of treating consent as a one-off formality at the start of the support process with something better. In our earlier study we identified this as a need for dynamic consent (Bowyer et al., 2018; Kaye et al., 2015; Williams et al., 2015). A common heuristic expressed by families, here and in the earlier study, is that data should only be seen by those that ‘need to know’, but this is very hard to achieve: first, because without transparency of data handling a family cannot verify whether this is happening, and so has to rely only on feelings and supposition to inform their trust. Second, the need for fair judgement over who should access families’ data is objectively important given that some support workers expressed a belief that their right to access families’ data should overrule families’ consent:
Worker A: “I think to enable us to work with families, we need to have as much information to give them the best possible service. So, I think we should be able to [access their information] regardless of what families say.”
Researcher: “Regardless of what they say?”
Worker A: “I do, yes.”
Researcher: “Does everyone feel the same way then, that they don’t get a say?”
Worker B: “Yes, because you need as much information as what you can.” [SQ22]
This suggests that to ensure the ‘need to know’ is determined fairly and accountably, independent oversight might be needed; other situations that would benefit from this include deciding what parts of a medical history are ‘relevant’ [SQ23], arbitrating situations where legal duties may require the breaking of consent [SQ24], and being able to identify and address situations where recorded information may not tell the full story [CQ8].
These findings suggest that not just transparency but a progressive attitude to data practice, actively challenging current data-centric norms, would enhance trust around data handling, access and decision-making, and lead to a healthier support relationship. This could even include thinking about new ways of using data, for example at a collective community level [SQ78], to promote an open data-sharing culture.
Through the workshops described in 4.2.5, I have successfully advanced my understanding of the human experience of data (RQ1) and the role of data within service relationships (RQ2), specifically for the Early Help context. In this section, the findings described above in 4.3 (and the preliminary findings in 4.2) will now be contextualised with respect to existing literature from Chapter 2 and beyond, drawing conclusions as to the value of involving people with their data (4.4.1), the need for human interaction to make data access effective (4.4.2), and the possible impacts of a shared data interaction approach in terms of shifting the locus of decision-making closer to the supported family (4.4.3).
The above analysis of attitudes to data usage in the UK early help context reveals that data about supported individuals and their families is already an integral part of current care practice, providing great value in building up a more complete picture of a family’s life, in service of better support and decision-making. However, this comes at a cost to the family’s autonomy, and we have identified a number of problems with the prevalent mindset in the care system – which is that, just as in the commercial sector (see 2.1.2 on dataism), families’ civic data is considered a resource to be utilised. This mindset carries an implicit assumption that data is an objective source of truth, which our participants tell us it can never be. Supported families lack awareness of what data is held about them and how it is used: this can lead to false expectations and surprises, and in the worst cases can feed feelings of fear or suspicion which harm the effectiveness of the overall care relationship. The present data-centric approach across civic systems means that stored data can often serve as a proxy for families’ involvement and, without any involvement of the family in checking data accuracy, is susceptible to inaccuracies and errors of judgement due to out-of-date, incorrect or missing data, which can directly affect supported families in the form of prejudice, discrimination, or privacy violations (Bowyer et al., 2018).
Our findings that trust is critical to an effective support relationship are consistent with literature which states that trust in the independence and integrity of the data-collecting and data-holding institutions is essential (Dijck, 2014). Trust currently rests upon feelings and impressions rather than the true accountability families would get by seeing what data is held and how it is used. This trust is often absent or reduced due to Early Help services not involving families with their data. Families must trust not only the system, but the support worker themselves; our findings suggest the best way for a support worker to build trust with a family is to show that they have, and are continually striving to develop, an ongoing and deep understanding of the family as individuals, whose perspective is more important than ‘what the computer says’. The more they are treated as people, not ‘objects to be administered’ (Cornford, Baines and Wilson, 2013), and the greater awareness and access they have to data records and data handling and decision-making processes, the greater the trust they can have in the system and the more effective the relationship will be. Shared data interaction practices such as checking data together, visible data recording, family sign-off, or contribution of their own perspectives as data, give the family direct evidence that they are being listened to and that their viewpoint is important even when it contradicts the digital record, which would be very powerful in building trust. Transparency of processing allows accountability – something that is currently all but impossible, and this would further empower families by allowing them to gain confidence that they are being treated fairly and that data about them is accurate (established as requirements from families in the preliminary study (Bowyer et al., 2018)). 
It is evident from our findings that a trustworthy care system requires the direct involvement of the individual(s) being cared for and that the mechanisms of shared data interaction offer specific shapes in which that involvement could take place.
Consistent with field studies such as the World Health Organisation’s decision-making tool (Johnson, Kim and Church, 2010), we found evidence that staff and supported families believe they would be able to collaborate more efficiently through shared data interaction, as it would be more evidence-based. This has the potential to remove inefficiencies such as spending time correcting misunderstandings or repairing damaged relations caused by misjudgement, and the emergent practices of using data to track progress are already proving to be an effective and tangible way for families to improve their situation; giving them personal data interfaces, unlocking the ability to track this data outside of the support engagement, would further empower them to be self-sufficient. A digital health innovation project in South Africa echoes our findings on the importance of trust, agency and involvement of the individual: “The user must feel or experience trust, have to change behaviour, feel that they can control and increase their own access to a system. Their uptake and use are essential for such a [digital ecosystem] to work or to be regarded as a sustainable solution.” (Herselman et al., 2016)
Viewing data as a shared resource to be curated together would also solve the problem that the current system in effect lacks a true consent mechanism, since the initial consent is, in practice, a handover of power that gives the care authority carte blanche to collect and use data about the individuals – a ‘point of severance’ (Luger and Rodden, 2013). In effect, the ongoing access to and direct use of data by families would serve as a practical implementation of a ‘dynamic consent’ model (Kaye et al., 2015; Williams et al., 2015); instead of consent being seen as the acquisition of a formal permission that has to be certified, stored, reviewed and modified, adopting simple practices such as talking families through their data and carrying out regular checks together could provide a practical but less bureaucratic guarantee that families are on board with the way their data is being used, since their ongoing awareness, combined with the absence of complaint, can be taken as satisfaction. If implemented in a robust manner, this approach has the potential to greatly simplify the consent challenge for authorities, requiring simpler processing and reducing liability. Families will be happier with the use of their data if they can see it, notice issues and speak up when they feel something is amiss. Additionally, the sharing of responsibility for data stewardship between both parties can reduce the liability for support workers; some were fearful of missing something important when given access to large amounts of families’ data. In this model, where conversations are more focussed upon data, relevant information can be identified more quickly while at the same time mistakes can be spotted sooner; data becomes a resource that both parties use to inform their conversation, rather than the support worker’s sole responsibility.
With families involved in checking and shaping their own data, that data can become more reliable and accurate, which goes some way to addressing the problems described by Cornford et al. of the state forcing families to be represented through data models that are not up to the task of representing the complexity of their lives (Cornford, Baines and Wilson, 2013). This need to give the user a role in understanding and influencing the life of their own data is identified as a key ingredient of moving towards a more progressive model of digital citizenship. In 2016, Bridle explained:
“If, instead of disempowering users in the name of simplicity and ease of use, we acted to empower them and ourselves through increased literacy in the technologies employed, and constructed systems where data about behaviour can be more easily quantified and controlled by the user, then we would have the tools at our disposal for a more equitable negotiation with commercial and governmental forms of power.” (Bridle, 2016)
Perhaps the greatest benefit to the care organisation of shared data interaction approaches would be the inclusion of supported families to a much greater degree as a stakeholder in their ‘case’. Instead of the care worker taking a position of authority, passing judgement and delivering advice, the care worker becomes an ally, with the family member(s) empowered as an agent in their own self-care, with a greater ability to take action and drive things forward than they had previously (see Theme 2); this is also a practical instance of the HDI concept of agency (Mortier et al., 2014), and in shifting the power balance toward the family it can also be seen as an antidote to current data-centrism in the system and society at large (see 2.1.2). Supported families would be able to trust that their interests are being looked out for, and, through their ability to contribute to and access their ‘data self’, to take part in informed decisions that could improve their lives and to use their data in new ways to serve their own ends.
Our findings reveal that the current inequality over families’ civic data will not be solved simply by opening up databases to families and giving them access. They must be able to meaningfully comprehend the data and meaningfully effect change based on what they learn from it. This involves the translation of raw data into meaningful information (see 2.1.1) – through summaries, visualisations and explanations – a need that we have identified even though the creation of such information representations would be challenging, as it is not clear who would have the access, skills and mandate to do this. In the designs and desires of our participants we see confirmation that, as described in one of the central tenets of HDI, the information available to individuals must be legible (see 2.3.2 and (Mortier et al., 2014)), but also that their access must be effective (see 2.1.4 and (Gurstein, 2011)). This includes providing suitable opportunities for access – for example via personal data interfaces and not just within support meetings – as well as addressing barriers of technology, literacy, and mental or physical disability. Our participants’ ideas around audio interfaces are a good example of the extra steps that would be needed to provide effective access for all. Supporting the range of all possible needs means that, to be effective, information access must be supported by a human relationship – one where someone can both explain the data and answer questions about it (see 4.3.2.4). It is the combination of effective data access and human-to-human interaction that makes data access meaningful, and the former without the latter will not empower the individual concerned; the storage of and access to data necessitates an ongoing conversation between data holder and data subject.
The system needs to have a human face or point of contact that the individual may put their trust in and to whom they can address their questions; as others have noted, simply giving access to raw data would be inadequate and limiting (Cornford, Baines and Wilson, 2013).
By focussing on the human aspect of the proposed use of data within the support relationship, we can see that as well as improving accuracy, consent and trust, shared data interaction could bring practical benefits by facilitating a better interpersonal interaction. By physically bringing data into the interaction – be it a printout of a table or graph, or a tablet or 2-in-1 device – rather than just reporting it verbally, this representation serves as a focal point for discussion, bringing both parties to the same topic space faster and more efficiently than abstract discussion would. The data records here function as a boundary object (Star, 1989, 2010; Bowker et al., 2015), just as my Data Cards did within my own research. The families understand it because it relates to their life, and the support workers understand it because they are familiar with the systems it came from. As such, it can become a valuable tool for encouraging families to open up, even if only to query or challenge something at first. Many of our participants talked about how looking at data would provide a discussion stimulus or serve as a conversation starter. This initial use could lead on to using that data, as it changes from meeting to meeting, as a metric against which to measure progress, something which could bring a feeling of reward and accomplishment to the family and contribute to their future success. Also, it provides support workers an opportunity to be less adversarial, by positioning themselves as equals looking at the data together (‘let’s make sure this data is right’) rather than appearing as if they side with the data by being the ones who voice it (‘Our records say that you have…’). The effectiveness of having data representations as ‘things to think with’ that can establish common ground is discussed in our prior work (Bowyer et al., 2018) and is also echoed in the methods in this research (see 3.5.2). 
Workshop C in particular, which brought support workers and supported family members together, used storyboarding action cards in specific fictional scenarios. These cards provided a focal point for discussions and helped the participants to quickly imagine a realistic situation, again serving as boundary objects. The yellow (for families) and blue (for staff) borders on the cards helped ensure that both parties owned a piece of the puzzle: we had given no direction about who would place which cards, but we observed parents feeling confident to place yellow cards and support workers keen to place blue cards, because the card helped them identify with the corresponding role in the scenario and feel ownership over the choice of options that would be available to them. Similarly, the green-bordered cards (which corresponded to actions involving both parties) almost always resulted in both parties discussing and agreeing a view before the card was placed. If we relate this to an imagined discussion of actual data records, we can envisage that the presentation of the data as being “yours” or “ours” would have a noticeable effect upon how the families would engage with it, and the strength with which they would perceive the power of the data holder over them. This interchange within a research setting gives some insight into how the dynamics of shared data interaction might work if implemented in practice.
Having access to the data within the context of the support relationship is a key enabler of agency (Mortier et al., 2014) for the family members; an ability to interact with and correct or comment on the data directly would give them some agency that they do not currently have. In line with our findings that regular reviews of consent and data need to take place, and that the ability to raise a question or start a conversation at any time is needed, we can consider that the availability of these capabilities on an ongoing basis would satisfy a second HDI requirement, negotiability. If there is no ability for their comments or corrections to the data to actually influence the support discussion and the work being done, then they have no negotiability – their data access is not really part of the system, but tangential to the actual support process. Therefore, efforts to deliver effective HDI capabilities in future should focus on interpersonal interaction and the role of the human in the information system, as a data interface is limited by its operational context in its ability to truly empower a data subject. Indeed, even the term ‘data subject’, which persists in progressive data protection regimes (described in 2.1.3), embodies the prevalent problematic stance, evoking as it does imagery of a medieval king looking down upon his subjects. As our participants all strongly agreed, supported families ‘should be treated like people, not database records.’ [S4, see also 4.3.3.1]. This framing can inadvertently become problematic in early help practice focusing upon child welfare: ‘children [can be seen as] the objects of a variety of concerns which need to be acted upon rather than agents of their own lives’ (European Commission, 2014).
Analysis of the Child Index, an early warning electronic information system for child welfare in the Netherlands, drew a similar conclusion on the importance of maintaining a compassionate human aspect in family-state relations:
“Taking into account that [care] professionals’ first love is the best interest of and care for a child, it is recommended for policymakers to provide enough room for the ‘love’ between future technologies and their social actors to flourish.” (Lecluijze et al., 2015)
In pursuit of RQ2, the four Case Study One workshops and the preliminary research have explored the role of data within the Early Help support relationship (see 2.2.5), looking separately at family and staff perspectives before bringing both parties together to discuss how both parties’ goals might be served by a model of shared data interaction. In workshop C, we explored the mechanics of shared data interaction at an interpersonal, sociotechnical level (see 2.3.3), mapping out a possible narrative in terms of human-human and human-data interactions. I present here a model for understanding why this could be important for rebalancing power between the supported family and the state, based upon a concept I have developed called ‘shifting the locus of decision-making’ (LDM). This concept is distinct from locus of control (Spector, 1982), which normally refers to personal willpower, and locus of power, which refers to the concentration of power within an organizational hierarchy. LDM refers to the place where decisions are made, and it may or may not coincide with existing authority structures. A pattern can be posited in which decisions are typically made, germinated or championed close to where data is accessed. In an effect that has been anticipated since as early as 1970 (Klatzky, 1970), the increasing use of data in services across private and public sectors (a phenomenon detailed in 2.1.2) has concentrated the LDM with data holders, who collect service users’ data to serve their own purposes.
[TODO: Replace with higher resolution version]
The current and imagined approaches are shown in Figure 22 above. In the current model (left), all access to data by families is through the support worker as gatekeeper, who decides the scope, content and nature of their access – here the LDM is effectively locked away from the family’s participation, and families’ use of data is limited because any data must flow through the gatekeeper. In a more equitable model (right), both support worker and family member are positioned as allies looking at the data together. This model changes the nature of the support relationship, as some of the work that was previously done solely in the domain of the data holder (specifically, data maintenance and the direct use of data to inform judgements and plans) would now take place in a different context – the two-party context of the support meeting itself. The removal of the gatekeeper role redistributes the power to interpret, select and judge data much more equitably between the two parties; families would no longer be prevented from participating in data-based decision-making. I theorise that shifting data access from the domain of the support worker to the shared domain of the meeting between the two parties would therefore move the LDM closer to the middle of the relationship, where it would rest at the heart of the support engagement, creating a more balanced relationship and increasing families’ agency and power. Within the findings above we see evidence that both families and staff would value a shared data interaction approach, with multiple participants independently suggesting potential benefits that could be gained by techniques such as reviewing data and consent together (4.3.2.2, 4.3.3.2, and 4.3.3.3). While participants perceive shared data interaction as an improvement, such an approach has not been tested in practice, so it is important to consider what the benefits and implications of such a shift might be:
The potential benefits in terms of empowering families are significant. As detailed above, it would give them a role to play as agents in the life of their data, and a new ability to create and curate their own ‘data self’ – the representation of them that is seen by the state – so that it is as fair, accurate and representative as possible (Bowyer et al., 2018). But more than that, given the increased visibility of the metrics by which their progress is judged, they would be empowered to take steps to influence any poorer metrics by making improvements in their own life that would result in those metrics improving visibly, which they could then use as evidence to prove their achievements – a positive feedback cycle that was previously only indirectly possible, if at all. By shifting the locus of decision-making, families could take more responsibility for their own lives, through an increased ability to reflect and make plans – an important element of harnessing one’s personal data for self-improvement (see 2.2.3 and (Abiteboul, André and Kaplan, 2015)), thus ‘encouraging the family to take full accountability for their own responsibilities’ as one support worker put it [SQ75]. In their 2016 paper, Crabtree and Mortier also recognise the importance of exposing individuals to actual data if accountability is to be achieved (Crabtree and Mortier, 2016). The perceived benefits of individuals directly using data-based interfaces for health and wellbeing are already accepted, with 93% of doctors believing that apps can improve health outcomes (Kostkova, 2015).
The above are benefits to the supported individual, which of course can be seen as benefits to the care provider as well, given that the function of the early help service is to help the supported family improve their situation as effectively as possible. But shifting the LDM also carries practical benefits for the care provider: if the family are involved in the stewardship of their data, this reduces the burden and responsibility upon the authority to look after that data – instead, ensuring completeness, accuracy and fairness becomes a shared responsibility. And if responsibility is shared, this should also reduce the likelihood of complaints or litigation, because it can transform the way that families think of the care provider, moving away from ‘us and them’ thinking towards a more equitable stance. An additional advantage of a cooperative approach to data stewardship is that, provided the data subject remains engaged, informed and understands the data and processes that exist, the consent problem is solved; the scope for non-consent is reduced because at every single meeting (and perhaps even outside those meetings, if individual personal data interfaces are available) the supported families are involved in a conversation that directly enables them to voice their approval or concerns about the ways their data is being used.
However, implementing such a change to the system would not be without its challenges. There would be significant costs: new equipment such as tablets or 2-in-1 devices might need to be purchased if support workers do not already have these. New software interfaces would need to be commissioned, developed and purchased. The existing configuration of IT systems in the public sector (see section 4.1.2) is not well-suited to the creation of such unified data interfaces due to its fragmented nature (Copeland, 2015). Identity management in this context is already very challenging to negotiate (Wilson et al., 2011). Support workers would need additional training on both software and hardware. The need to increase digital skills across health and social care has already been identified as a current issue in the UK (Honeyman, Dunn and Mckenna, 2016) and in other countries such as Poland, where it is deemed critical (Soja, 2015). This will become particularly important in a system where the care workers are also the ones who would be helping individuals to make sense of digital information. The use of computer-based communication and information approaches would need particular care in the context of child welfare (Tregeagle and Darcy, 2008). Local authority business processes would need significant overhauls to recognise individual members of the public as an important part of the system – which would likely carry with it new considerations for system access controls, technical support and public liability insurance. In particular, the provision of personal data interfaces to the public, and new communication channels for public enquiry, would carry with it a large human resource burden to manage and support those channels and usages.
While the creation of a direct communication channel between supported individuals and support services does on the face of it have the potential to carry some savings for the state in terms of reducing the amount of ‘in-the-home’ contact necessary – which is particularly challenging and costly to deliver in rural areas far from major towns (Kriisk and Minas, 2017) – the idea of the data access being supported by human contact, and of making more decisions together, may ultimately require a greater investment of manpower in communicating with supported families. Measures would have to be put in place for when things go wrong: dispute resolution procedures and additional legal and information governance support would likely be needed. It is also possible that giving more power to families could create new challenges: it is not impossible that particular individuals, for whatever motivation, might try to be destructive, manipulative or otherwise challenging to the system, and might try to use their new powers against the state (for example, hiding criminal activity or misleading workers for personal gain). While very unlikely to be a mainstream issue, this is a fringe possibility that any process or system must still consider and plan for. It would be fair to criticise this model of human-centred state interaction in that it would not be cheap or scalable; in essence this model creates mechanisms for families to have more interactions with the state, which means that every case would take more worker time in a system that is already overburdened and underfunded [Copeland (2015); ADD REF Local Government Association]. The state has increasingly adopted a data-centric approach to citizen interaction in part because it cannot manage to provide human relationships with every individual citizen.
But now this approach has become ingrained into government approaches to citizen relations – ‘it is no longer a technological necessity but it has become a political intention’ (Bridle, 2016). What we have identified is that there is a need to reverse this trend, not just in practice but in political ambition, if people’s interests are to be best served, and if a welfare state is to be truly enabling (Miettinen, 2013). By taking a more innovative approach to digital policy, it is possible that governments could be more effective in helping to involve those citizens that have become disadvantaged by the current system – a more human-centred approach could help to combat the digital divide (Kalvet, 2005; Steyaert and Gould, 2009).
My model that shifts the LDM is theoretical; it does not yet provide an implementable solution that could be rolled out at scale. Rather, it should be thought of as a useful mental model to stimulate further discussion about how care providers could or should change their processes and systems. The value of this contribution is that it shines a light on the positive and negative impacts of current data-handling and data-use procedures upon relationship effectiveness, and identifies imagined practices that could be preferable to and more efficient than current practice. The findings serve as a challenge to the status quo that should encourage early help providers to question their priorities when it comes to the use of people’s civic data in pursuit of the primary goal of Early Help: to empower families to help themselves as effectively as possible.
Through four participatory co-design workshops with supported families and support workers in North-East England, I have highlighted five major problem areas which our participants perceive to exist with current personal data practices:
A power imbalance – Families’ personal civic data is collected by care organisations and viewed as a resource to be utilised by the support workers, creating a structural power imbalance against families which is further emphasised by the authority, influence and network centrality of the support service within each family’s data landscape.
A closed and opaque data ecosystem – Families lack awareness of what data is held about them and how it is used, with support workers (who themselves have limits to their access) functioning as gatekeepers to what families will be told about.
Ineffective, meaningless consent – The current consent model, while legally satisfactory, is ineffective, as it is viewed as a one-time initial hurdle after which support workers can do whatever they deem necessary with families’ data and those families are never again given any meaningful choices about what happens to their data.
No accountability and fragile, limited trust – Without any transparency or ability to request or demand changes to data or data practices, families have no ability to hold data handlers to account. The lack of visibility makes families’ trust in the system hard to earn and fragile to maintain.
A lack of agency or true empowerment – With families having no ability to shape the way they are represented in data or even just to see themselves in data as the state sees them, opportunities are missed to truly empower families to be better represented and to better themselves.
Through these explorations of shared data interaction and personal data interaction, I have shown there is both a need and a desire for a new approach. A model in which support services are deliberately open with families’ data and bring it to the heart of their face-to-face consultations could address all five of these problems. The removal of the gatekeeper role over families’ civic data would shift the power balance towards the family, as it would give them a role in the stewardship of their own data. Providing families with a transparent view of stored data, and with clear visibility of data recording and usage, would enable accountability – previously absent – which in turn could help to improve trust. With the family involved at every stage and able to see their data at any time, the consent problem would be largely solved, because families would be able to speak up immediately at any point should their wishes change in the light of new developments or new information. With the family becoming truly involved in data-informed support conversations that can make better decisions, and being more able to influence the way they are represented, they would be more empowered to make changes in their own lives and could achieve a previously unattainable level of agency.
Further benefits of a shared data interaction approach have also been uncovered; data visualisations and summaries could be very effective as conversation starters and as boundary objects, potentially leading to more effective conversations. The ability to reference specific data points over time can provide an objective measure against which to track progress – whose primary value is not to the support organisations (where such measures are currently used to assess service effectiveness) but to the families themselves, who would now be able to directly see the effects of their own actions in their data, much like the reflection capabilities we see in the self-informatics space. The shift from support workers reporting what the data says to ‘looking at data together’ would help to move the dynamic of the support interaction away from ‘us and them’ thinking towards a more collaborative, less adversarial approach. The inclusion of individuals in the stewardship of their own data would lead to more accurate data, because in reality the truth lies somewhere between what the data says and the family’s own perspective, and can only emerge through a combination of data and dialogue. Individual family members would be able to notice mistakes or gaps, and contribute explanations, context or additional data to enrich the picture. By ensuring discussions are based on data that is as accurate as possible, the quality of decision-making would naturally improve, and conversations would be likely to be more effective and efficient as they would be more grounded in reality.
In particular, we have shown that giving the family a role could be very powerful, because the ability to contribute their own data or have visibility of data recording would provide them with direct evidence that they are being listened to and that their perspective is seen to matter more than ‘what the computer says’. The ability to ask questions about their data, and to explain or clarify things seen in the data, treats the family with more respect than the purely data-and-technology-based approach of the state-citizen service infrastructure experienced on the whole by non-supported families. The ability to act independently, in their own time and in contexts outside of the support interaction, would allow individuals to alleviate concerns quickly and maintain confidence that their data selves – the version of themselves used by the state to inform decisions – remain fair and accurate; it would also open up new opportunities for individuals to use their data for their own ends in ways that were not previously possible. It is through the adoption of such measures that we could begin to facilitate the emergence of a human-centred personal data ecosystem (as described in 2.3.4) in a civic context.
In exploring the usage of civic data in its full sociotechnical context, not just from the provider’s or citizen’s perspective, we have shown that merely providing people with access to data would be insufficient to properly address the identified problems, and that Human-Data Interaction itself needs to be developed as a concept. As a sub-field of Human-Computer Interaction, HDI is largely considered in the traditional context of interacting with data through an interface, but this work – which, guided by our participants, has focused less upon layout and screen interaction and more upon the wider sociotechnical context of the support relationship – suggests that HDI can be more effective when the word ‘interaction’ is considered in an interpersonal sense. These insights begin to address the research gap identified in 2.3.5, to define the research agenda for human centricity in practice. Informed in part by this idea, I have explored further in a workshop paper how the HDI field needs to advance to consider the sociotechnical level as well as the interface level, which is outlined in (Bowyer, 2021).
Capabilities – or their absence – matter more than the on-screen technicalities of the data interaction. Data interfaces are limited by their operating context as to how much they can offer, but considering data interaction as a sociotechnical process, including the wider human-facing relationship between the individual and the representative of the state as well as the data interface itself, allows us to imagine a more holistic solution that can better address any situation arising. It is vital that the human perspective be given the highest priority, so that professionals’ flexibility is not limited, but also because data cannot adequately represent the complexities of human life – people are more than just data, and you have to talk to them to make sense of their lives and to avoid excluding them. The usage of data must always be supported with dialogue and engagement. It is the need to focus on the human aspect that explains why trust underpinned nearly every single problem imagined by our participants – without an open system that encourages dialogue and discussion it is very hard not to close doors, create suspicion and harm trust.
Through the sentence ranking exercises I have been able to gather a snapshot overview of what this sample of support workers and supported families think about data, and where they agree and disagree (see Figure 21). The detailed analysis of workshop transcripts has provided an understanding of the positive and negative impacts on the support relationship of current civic data practices within early help, and through our qualitative analysis we have been able to identify best practices, seen in the subthemes of sections 4.3.2, 4.3.3 and 4.3.4 and expressed in our CHI 2019 publication as 38 specific practices for Early Help services (Bowyer et al., 2019), many of which are currently imagined or only just emerging. Participants believe these best practices would improve families’ engagement and the support they receive. These suggestions can serve as a challenge to the status quo that could inform policymakers attempting to reform care services or digital citizenship offerings. There would be significant challenges in adopting our proposed changes – in cost, training, manpower and emergency planning – as with any systemic practice change in an organisation, but such an approach may get closer to the heart of the real issue of empowering ‘left-behind’ (disempowered) families than a purely state-centred approach to problem solving, and may offer part of a route to a more enabling welfare state. More generally, this work serves as a reminder that as we move into the data-driven age it is important that data should stay close to the people it is about, rather than to those that use the data to provide services, and that service practice and processes should remain human-centric rather than data-centric.
The general principles expressed here could be equally applied to other domains including education, healthcare, democracy and commerce, and this emphasis upon individual capability over interface design is a useful mindset that could be applied to many human-computer interaction and design endeavours.
In this chapter, I will describe the second major case study of this PhD, in which I took 11 participants through a longitudinal, in-depth, one-on-one process of three interviews with coaching and support in between, with the total engagement per participant lasting approximately 4 hours over a three-month period. The purpose of the research was to gain a deeper understanding of people’s attitudes to the kinds of personal data held by companies in people’s everyday lives and what they want from that data (in pursuit of RQ1), and specifically to examine the human experience of existing in a data-centric world (see 2.1), with each individual having a number of relationships with service providers that involve the use and holding of personal data; in line with RQ2, the goal is to better understand the role of that data in those relationships. In particular, having gained an initial understanding of attitudes, hopes and expectations, a further objective was to examine how those expectations might change during the journey of digital life mapping, data request making, receiving and examining of data, and scrutiny of responses, collectively forming a holistic understanding of “the human experience of accessing your data with GDPR.”
In section 5.1, I will expand on chapter 2 to explain the context of using GDPR in research as a means to retrieve personal data. In 5.2, I will explain the stages of the interview process (including details of how participants were sensitised) as well as the preparatory and intermediate steps I undertook as researcher. In section 5.3, I will explain the model of personal data types developed for this study, and will present quantitative and summary data from the interviews, explaining how participants’ GDPR access requests progressed, highlighting participants’ shared hopes and goals, and examining in particular how their perceptions of power and trust were affected by the experience. In section 5.4, I will describe the three themes uncovered through thematic analysis: that organisations provided participants with insufficient transparency to meet participants’ hopes and their legal obligations (5.4.1), that people struggle to find meaning and value in their data when they do manage to access it (5.4.2), and that providers’ data practices (in particular their GDPR request handling) can be harmful to their users’ trust, but that greater openness can have an opposite, positive impact (5.4.3). I will discuss the implications of these findings with reference to prior literature, from the perspective of policymakers (5.5.1), data-holding companies (5.5.2), and individuals (5.5.3). Finally in 5.6, I will summarise these insights in terms of how they can advance our understanding of the research questions and their wider significance.
As established in 2.1.2 and 2.2.4, people live digital lives, inevitably involving the use of myriad digital services that collect personal data, which is subsequently mined for value and exploited at scale, creating an imbalance of power between data holders and data subjects, and an exclusionary landscape around data use which is difficult for individuals to navigate: once service providers have acquired data about individuals, that data becomes a focus for their decision-making and customer relations become less important. This everyday context is the chosen research setting for this case study.
Section 2.1.4 established how unaware many people are of this imbalance around data, and that there is a want for effective access to data to restore individual agency. As described in section 2.1.3, policymakers have been attempting since the 1970s to introduce legislation to tilt the balance of power back towards individuals, most recently and most notably the European Union’s General Data Protection Regulation, which legally endows at least 513 million individuals with new rights to timely data access, explanation, erasure and correction (Information Commissioner’s Office, 2018).
Data protection and misuse issues have grown in the public awareness since the Snowden revelations in 2013 (Gellman, 2013), and have become even more important following the Cambridge Analytica scandal in 2018 (‘Facebook–Cambridge Analytica Data Scandal’, 2014; Chang, 2018), which may have resulted in manipulation of voting outcomes through personal data use, and the COVID-19 pandemic (O’Donnell, 2020; Hamon et al., 2021). Since the GDPR’s launch in May 2018, it has undoubtedly resulted in new data access offerings; many large consumer companies have developed ‘privacy hubs’ or improved privacy policies where individuals can learn how their personal data is handled, as well as data download portals where they can easily obtain copies of it (‘Privacy - Apple (UK)’, no date; ‘Privacy & Terms – Google’, no date; ‘Privacy’, no date; ‘Facebook - Data Policy’, no date). Almost all data controllers and processors have now updated their privacy policies to include clear processes for data subjects to request copies of their personal data per their GDPR access rights.
However, it is not known how effective these offerings and processes are for service users, and how individuals feel about them in light of this backdrop of public concern. No service providers make data access statistics publicly available, but anecdotal reports from industry insiders suggest GDPR access rights and data download dashboards are not well-known and hardly used. This presents an opportunity to take individuals who have not previously used these capabilities on a journey of discovery that might enable us to assess the impact of these processes over time, and whether – by compelling data holders to create such offerings and respond to access requests – the GDPR succeeds in its goals to ‘enhance the data protection rights of individuals’ (Council of the European Union, 2015) and to give people ‘control over their personal data’ (The European Parliament and the Council of the European Union, 2016b).
Since it came into effect in May 2018, the GDPR has opened up new possibilities for research (Comandè and Schneider, 2021); the ability to obtain one’s data records from organisations provides the general public with a potential deeper view inside those organisations, much like the UK’s Freedom of Information Act has provided a view into governmental and public sector organisations, enabling research and improving accountability (Savage and Hyde, 2014). Such legally-enforced transparency can also provide researchers with a window into organisations and their processes that was previously only available on the basis of goodwill. Ausloos and Veale (Ausloos, 2019; Ausloos and Veale, 2020) provide an outline approach for using the GDPR in research, as well as describing the many ethical and methodological considerations that should be made. GDPR research can, however, be as simple as inviting participants to exercise their rights of access and talking to them about the experience and any changes in their perspective, which is the approach this study uses, as detailed below.
The GDPR process itself has also been examined from many perspectives by researchers: to understand data holders’ compliance with legislation (Ausloos and Dewitte, 2018; Arfelt, Basin and Debois, 2019); to evaluate data portability (Wong and Henderson, 2018) and ‘privacy by design’ (Waldman, 2020); and to compare its effectiveness in public/private sector contexts (Quinn, 2021) or in improving explainability (Hamon et al., 2021), fairness (Kasirzadeh and Clifford, 2021), consent (Human and Cech, 2021), transparency (Spagnuelo, Ferreira and Lenzini, 2019) and the reduction of data breach risks (Gonscherowski and Bieker, 2018). Potential negative impacts have also been considered; the GDPR could be seen as a threat to privacy (Bufalieri et al., 2020) or as an impediment to health research (Clarke et al., 2019).
Clearly the GDPR has spurred a broad variety of research, spanning legal, social and technology domains. Yet there is scant research into the individual human experience of the GDPR. Alizadeh et al. conducted a study with 13 users of a German loyalty programme, interviewing them before, during and after they made GDPR data requests (Alizadeh et al., 2019), and finding that better responses and GDPR education were needed. This is a good example of the sort of work that is needed to explore the human perspective on the GDPR journey, though this particular study was limited in breadth (only one service provider was targeted) and in depth (the data returned from companies was discussed largely at the high level of ‘were your expectations met?’, and the potential to use the data for one’s own benefit was not examined). The implications of the experience for the participants’ relationship with the provider were also not explored; indeed, the impact of data handling practices upon relationships appears to be an under-researched area in general. Recent work (Bufalieri et al., 2020; Glavic et al., 2021; Zuckerman, 2021) has established that openness and transparency around data handling are key to services establishing individuals’ trust; an echo of this was seen in a public sector context in Case Study One (see Chapter 4). In a commercial context, such changes in trust can impact customer satisfaction and business success.
At a more fundamental level, there is a need to understand the experience people have when using the GDPR; companies’ GDPR processes have been designed to comply with legislation rather than by focusing on individual needs or desires (Abowd and Mynatt, 2000; McCarthy and Wright, 2004; Wright and McCarthy, 2008) (for more details on experience-centred design refer to section 3.2). It is highly likely that some of those needs will have been overlooked. Such experiential understanding could inform the design of improvements to companies’ GDPR mechanisms, as well as identifying specific needs that might be best met through improvements to policy, including to the GDPR itself.
Given that data-centric services now span all aspects of our lives, and the amount of personal data about individuals has grown, it has become critical to think about the way people interact with data as a ‘whole life’ problem. This is one of the reasons this study focuses on the layman rather than a particular demographic, and on ‘everyday services’ rather than a particular domain. Data has transcended the machine and now encodes facts about our lives; it exists across devices and across providers (Weiser, 1991; Mydex CIC, 2010; Abowd, 2012). This means that personal information management has become a sociotechnical problem (see section 2.3.3), which can no longer be solved as a filing-and-retrieval problem as per traditional PIM approaches (see 2.2.2), but only when considered as a multi-party negotiation over representation, ownership, access and consent. It is important to evaluate the GDPR in this context. Up to now, individuals have not had the means to participate in or initiate such negotiations. On paper, it would seem that GDPR rights do convey this capability, but it is not known whether, in practice, service providers’ responses to the GDPR actually deliver data subjects the ability to take part in negotiations around data in a fully-informed way. While some research on relationships around data and data as a shared resource is now emerging (see 2.2.5), the relationship with data-holding service providers has not been examined in this way.
A roadmap for best practice in this space can be found in the emergence of the ‘personal data ecosystem’ concept (see 2.3.4). Researchers have identified that a human-centric approach to personal data is needed, placing individuals at the centre as controllers and overseers of their own personal data (Mydex CIC, 2010; Symons et al., 2017). This is an emergent space of much activity and research (‘Human Data Interaction Project at the Data to AI Lab, MIT’, 2015; ‘HDI Network Plus, University of Glasgow’, 2018; ‘HDI Lab, Heerlen’, 2020; BBC R&D, 2017; MyData, 2017; Symons et al., 2017; MyData.org, 2018) and provides a strong framing for us to evaluate the human experience of – and interaction with – the GDPR. Given people’s diminished agency and control over their data (Woolgar, 2014; Crabtree and Mortier, 2016), do the GDPR’s access rights, as implemented by service providers, provide the effective access (Gurstein, 2011) people need? Does the GDPR help people to achieve legibility, agency and negotiability, the three tenets of Human-Data Interaction (see section 2.3.2 and (Mortier et al., 2014))?
This case study aims to explore the research gap identified in 5.1.2 above, from this perspective of greater human-centric need in a sociotechnical, multi-party data use context. It will do so by scrutinising the experience of using one’s GDPR rights to discover how well the process meets individuals’ needs and expectations; in the process, the objective is to uncover problems and identify possible solutions that could address them.
To address these research objectives, 31 qualitative interviews were conducted, with a convenience sample of 11 individuals from a population of researchers and students at (or connected with) Newcastle University, aged 20-40 years; self-identifying as 5 females and 6 males. Participants were not data experts (only 1 had previously made a GDPR request), but were computer-literate, educated to degree level, and used to reflecting critically on their own behaviours and opinions. Participants were compensated for their time with Amazon vouchers worth £20.
Each participant’s journey progressed at its own pace (see Figure 23) with participants invited to three separate 1-on-1 interviews between December 2019 and April 2020. The scope and purpose of each interview was as follows:
Interview 1: Sensitisation, Life Exploration and Company Selection [1 hour, in person]. Participants were sensitised to the research context using an interactive tour of a poster display on the topics of GDPR rights, potential data-holding organisations, potential types of data and potential uses for GDPR-obtained data. Baseline data was collected on participants’ hopes and motivations, their current understanding of personal data, data access, data control, and power as it relates to data. Using a sketch interviewing (Hwang, 2021) technique, participants mapped out their ‘data lives’ (e.g. Figure 24), annotating key organisations that they have relationships with, types of data those companies might hold, and feelings about such data use and storage by each holder. Each participant selected 3-5 candidate companies to explore with GDPR requests.
Interview 2: Privacy Policy Reviewing, Goal Setting and GDPR Request Initiation [1 hour, in person]. To stimulate reflective thinking and measure impacts, participants were asked to discuss and score their initial feelings of trust and power with each company. Participants then viewed key sections of privacy policies on a screen with the researcher, to identify each company’s statements on collection and use of personal data. Participants then initiated an email GDPR request for each company, which had been prepared using a tried-and-tested template generated by personaldata.io (Wiki.personaldata.io, no date). Interview 2 took place in person, except for P10 & P11 whose interviews took place over Zoom due to the COVID-19 pandemic.
Interview 3: Detailed GDPR Response Review [2 hours, online video call]. Having allowed sufficient time for GDPR requests to conclude (there is a legal duty to reply within 30 days), a deep dive into the specifics of each GDPR experience took place. Participants’ personal data was not collected by the researcher, only described verbally; screen sharing was used to show excerpts to the researcher where the participant wished to do so. Participants were asked a structured set of questions about the completeness and value of any data returned, as well as new evaluations of trust and power, whether their hopes had been met, and any general feelings about the experience. Answers were recorded in a screen-shared spreadsheet, which was also used to structure the discussion (for a sample see [INSERT REF TO APPENDIX]).
Interviews were audio and video recorded, then auto-transcribed using Google Recorder/Zoom, producing a 370,000-word corpus. Transcripts were split up by topic and analysed through reductive coding cycles to produce thematic findings (see 5.4). Quantitative data from interview spreadsheets was summarised and analysed (see 5.5). Sketches, recordings, screenshots and field notes were referenced throughout thematic analysis to aid interpretation of the transcripts.
Initially, eight participants chose five target companies each and three chose four to request data from. One participant (P9) withdrew from the study due to COVID-19 after Interview 1. Five participants withdrew a chosen company upon further consideration. Reasons for withdrawing chosen targets included having one’s personal data mixed with that of other household members (Netflix), the account being in someone else’s name (Morrisons), not wishing to impact active customer support matters (LNER), and inability to contact the provider by email (ifun.tv, see below). One participant selected Newcastle University, which was vetoed by the research team to avoid conflicts of interest. Hence, 41 out of a possible 52 GDPR subject access requests were made (to 28 distinct data holders), as shown in Table 8:
Table: Table 8. Types of Data Holding Organisation Targeted for GDPR Requests by Study Participantsa
To ensure fairness and consistency, the aim was that all GDPR requests be sent by email to the identified Data Protection Officer, requesting that both a subject access request (Information Commissioner’s Office, 2021a) and a data portability request (Information Commissioner’s Office, 2021b) be initiated, and specifically enumerating and asking for those datapoints that the company stated in its privacy policy, as well as those which the GDPR entitles individuals to obtain. To identify these datapoints, company privacy policies were analysed and the necessary information was compiled in personaldata.io’s semantic wiki (‘List of target companies for GDPR requests’, no date). This wiki has a feature to generate bespoke GDPR request emails, which were adapted and then provided to participants (INSERT APPENDIX REF). Facebook, Apple, Huawei and Philips Hue did not offer a contact email address, so the email text (shortened where length restrictions applied) was pasted into a contact form. In one case, entertainment website ifun.tv, the only available means of contact was via WeChat, resulting in the participant (a Chinese citizen) choosing not to contact ifun.tv due to fear of Chinese government surveillance. Through analysis of companies’ privacy policies and with reference to GDPR rights, a taxonomy of the types of personal data that could be returned was constructed, using terms drawn from those policies and from the GDPR itself; it comprises five types of personal data, as shown in Table 9.
Table: Table 9. Types of Personal Data Potentially Accessible from Data Holders via GDPR Rights
Participants reviewed and discussed privacy policies for their chosen target companies and were asked to define hopes and expectations for each GDPR request (see Table 12). 74% of the goals expressed related to participants wanting greater insight into, and control over, their personal data ecosystems; most commonly a desire to see the breadth and depth of data collection by companies, to understand what was being inferred and how personal data was used, and to use such information to better assess the trustworthiness of those companies. Such goals were often motivated by curiosity or suspicion, or a desire to shed light on specific incidents or answer specific questions. In some cases participants wanted not just to learn and acquire knowledge but to take control of or delete held data. In contrast, 26% of goals related to gaining personal benefit from one’s obtained data: motivators included the desire to reflect on past data to gain self-insight, as well as goals relating to creativity, fun, and nostalgia.
At the conclusion of Interview 2, participants were provided with the emails and instructions to start their GDPR requests, which progressed as illustrated in Figure 25. Eight requests resulted in no data being obtained, due to data holder non-responsiveness, inability to access the right account or satisfy ID requirements, or confirmation being received that there was no data to supply. 32 requests (80%) resulted in at least some data being returned; 10 of these directed the participant to use a publicly-available download dashboard such as Google Takeout, and the rest resulted in data being made individually available. Of these, one response was mailed as printouts, another was mailed on CD-R, and the rest were delivered by email (sometimes via a secure download website). While 22 companies supplied bespoke data packages, four did not return data within the 30 days the legislation specifies (note: requests took place within the context of a global pandemic, so response rates may not be typical). Following discussion, participants judged that all 32 requests receiving data had failed to return all requested data (across all five of the categories in Table 9).
Once each participant’s GDPR requests had reached a conclusion point (as described above), they were invited to discuss the GDPR response in detail. Participants were asked to describe (and optionally show) the data they had received, then to evaluate the data holder’s response for each data type, according to multiple metrics designed to assess the perceived quality of the GDPR request handling and the subjective value of any returned data. All questions were posed from the perspective of (a) the data that providers said they collect and process in their privacy policies, and (b) the rights that the GDPR specifies, to ensure discovery of missing data or unfulfilled rights would be considered objectively. Participant responses were considered quantitatively (see Table 10) and qualitatively (see section 5.4).
Table: Table 10. Presence and quality assessments of GDPR responses by data type (as percentagesa)
Table 10 shows quality assessments for each data type, with rows in descending order of subjective value. Notably, the kinds of data participants valued most (derived, acquired and metadata) were less frequently returned, especially metadata (returned in only 4% of cases). Where data in these categories was returned, it suffered from poor quality, often judged incomplete, inaccurate, unusable and not useful (although acquired data was largely understandable). Even the most frequently returned category, volunteered data, was returned in only 53% of cases; where it was returned, accuracy (92%), meaningfulness (72%) and understandability (72%) were high. Observed data was least valued and also rarely returned or complete (yet judged to be of moderate quality). Across all data types, data was judged complete in only 22% of cases, and in 62% of cases personal data that privacy policies stated was collected was not returned, despite the legal obligation.
The above quality and coverage datapoints also allowed us to extract some information about which service providers were strongest or weakest in each category, and overall. This was done by tallying the “Yes” responses for each category and overall, then dividing by the number of times that provider was selected, to avoid inflating scores for popular companies. The outcome of this analysis is shown in Table 11. The companies that fared worst overall were those that did not return any data at all in response to a GDPR request (Sainsbury’s, Freeprints, Tyne Tunnels, LinkedIn, Huawei, Bumble, LNER). As a caveat, it should be noted that Sainsbury’s and Huawei did respond, claiming to hold no data for the requesting participant. The other named companies here did not respond at all, despite at least two follow-up emails being sent to them, and despite in some cases having initially acknowledged and promised to satisfy the request.
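The normalisation described above amounts to a simple ratio per provider. As an illustrative sketch only (the function, variable names and figures below are my own, not the study’s actual tallies or tooling):

```python
def provider_scores(yes_tallies, selection_counts):
    """Normalise each provider's tally of 'Yes' quality/coverage answers
    by the number of participants who selected that provider, so that
    frequently chosen providers are not inflated by raw counts."""
    return {provider: yes_tallies.get(provider, 0) / chosen
            for provider, chosen in selection_counts.items()}

# Illustrative figures only.
yes = {"Niantic": 9, "LNER": 0}      # total 'Yes' answers per provider
chosen = {"Niantic": 2, "LNER": 3}   # times each provider was selected
print(provider_scores(yes, chosen))
```

The resulting per-selection scores can then be ranked to identify the strongest and weakest providers, as in Table 11.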
Companies producing responses with good coverage and good quality included Niantic, Nectar and Sunderland AFC as well as to a lesser extent Natural Cycles, Revolut, Spotify, Tesco and Amazon. Facebook and Google fared well for the breadth of data returned (due in part to their download dashboards), though the quality of Google’s data was found lacking across multiple categories. Last.fm (owned by CBS) fared poorly overall due to poor category coverage, despite the data that it did return being of high quality.
Table: Table 11. Best and Worst Data Holders in Different Categories, According to Participants’ Judgementsa
At the conclusion of the final interview, participants were reminded of the specific hopes and anticipated data uses they had expressed at the start of their journey and asked about how well each goal had been met. These answers were recorded and combined to produce percentage values showing in how many cases goals were fully met, partially met, or not met at all, as shown in Table 12.
Participants felt their goals were not fully met in 78% of cases, and 54% were not met at all. Specific shared problem areas included (1) the desire to understand what providers infer from held data (7 participants), which was unmet in 73% of cases and only fully met in 7% of cases; and (2) the desire to delete one’s data, which was a stated goal in 10 cases but was only met in one of them. Four wholly unmet hopes were to investigate specific incidents (GDPR responses were often delivered as a one-off package without any kind of backchannel or opportunity to ask questions), to secure data, to check accuracy, and to move data to another service.
Table: Table 12. Participants’ hopes, imagined data uses and goals for GDPR, as well as resultant outcomes
Repeated scoring questions were used to examine how participants’ feelings towards the data holders changed throughout the process: participants were asked to assess trust from 0 (total distrust) to 10 (total trust), and to assess their perceived power on a scale from -5 (total provider power) through 0 (balanced power) to +5 (total individual power). Explanations and reasoning for initial ratings and for any changes were uncovered through questioning. By repeating the same question at different times, longitudinal comparisons could be made. Many participants’ attitudes did change as a result of the experience (as summarised in Figure 26), for both perceived power (45% of cases) and trust (66% of cases). Where attitudes changed, the change was often negative: in 63% of cases where participants perceived a change in individual power, that change was a loss of individual power, and in the majority (52%) of all cases participants felt more distrustful of the GDPR-targeted companies after completing the process (constituting 79% of cases where a change in trust was perceived). However, it is important to note that in some cases the GDPR had a positive impact: in 17% of cases participants felt their perceived power had increased, and in 14% of cases participants felt more trusting of providers after GDPR.
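The longitudinal comparison of repeated ratings can be sketched as a simple tally; a minimal illustration, with hypothetical ratings rather than the study’s data:

```python
def classify_changes(before, after):
    """Compare ratings taken at two points (e.g. trust on a 0-10 scale,
    or power on a -5..+5 scale); report what fraction of cases changed
    and, of those that changed, what fraction fell."""
    pairs = list(zip(before, after))
    changed = [(b, a) for b, a in pairs if a != b]
    fell = sum(1 for b, a in changed if a < b)
    return {
        "changed": len(changed) / len(pairs),
        "fell_given_changed": fell / len(changed) if changed else 0.0,
    }

# Hypothetical trust ratings for five cases, before and after the process.
print(classify_changes([7, 5, 6, 8, 4], [5, 5, 4, 8, 6]))
```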
Looking deeper into these datapoints, changes in attitude could be attributed both to the impact of reviewing the privacy policy and to the experience of the GDPR process and the discursive review of GDPR responses. Figures 27 and 28 show snapshots of power and trust ratings at different points in the process which illustrate these impacts. Seeking to explain these changes qualitatively, it was found that privacy policies often contradicted participants’ expectations, resulting in discomfort. In two cases (Philips Hue and last.fm) privacy policy review revealed that the service relationship was with a completely different company than the participant thought, which they found disturbing. LinkedIn’s privacy policy was noteworthy as exceptionally clear, reassuring and trust-enhancing to the participant, largely due to its ‘easy read’ text sidebars but also its good use of examples. However, it appears that simplifying privacy policies can go too far: Google’s privacy hub (which includes video explainers) was considered easy to understand but necessarily broad (given the breadth of Google’s services) and thus over-simplified, raising uncertainty about the generalisations made and in some cases increasing distrust.
Considering the process as a whole, participants’ attitudes were particularly impacted by the “hassle” (P11) they experienced in getting through the data access process, and by the realisation that what seemed at first glance to be a thorough response was, when scrutinised more closely in Interview 3 and viewed through the lens of privacy policy promises and one’s GDPR rights to the five categories of data, in fact quite poor.
Here I present outcomes from a deeper analysis of the participant experiences summarised above, derived through over 200 person-hours of iterative data analysis [76] of the interview transcripts. The three key thematic findings are:
| Subtheme | Description | Quote |
|---|---|---|
| A Desire for Awareness and Understanding | Participants want to see, know and understand the data held about them. There was particular interest to see data collected or inferred about them without their involvement, and to understand how data is used and shared and how that might affect them. | “[Companies have more power] because they’re making decisions about things and you don’t know how they’re making those decisions.” [P5] |
| Non-Compliance Without Consequence | Many providers failed to provide data on time or at all. In 100% of cases, returned data was incomplete, and many viewed this as non-compliance. Data holders’ freedom to disobey legislation was attributed to a lack of enforcement and seen as an exertion of power. | “I am surprised at Google’s unwillingness to provide me with all of the data … they haven’t provided me with all of my data. And that’s not legal.” [P7] |
| Inadequate Data Responses | Participants judged data holders to be unhelpful, GDPR procedures to be painful and ineffective, and returned data to be lacking in coverage and in quality. Their questions remained unanswered; after GDPR they were still “in the dark”. There was widespread disappointment and a view that GDPR did not confer any power to the individual. | “It’s kind of disappointing because I would have hoped that this process would have levelled the user power versus the organisation power in a way that holds them accountable and [it doesn’t] seem to be doing that.” [P1] |
| Subtheme | Description | Quote |
|---|---|---|
| Data Formats and Usability | Participants anticipated receiving data in formats they could explore, visualise, mashup and play with, but in fact often received data that lacked explanations. Data was often arranged in ways more reflective of internal systems than optimised for use or understanding. | “They did give me the data, but not how it fitted together. It’s like being given the bricks to a house, and then they’re like ‘Here’s your house’. It doesn’t really mean anything when it’s just bricks, if you don’t know how to put it together.” [P5] |
| The Search for Meaning and Value in Data | Participants found the large volumes of data that were sometimes returned overwhelming, and wanted summaries and breakdowns to understand it, as well as tools to help them make sense of and explore the (often technically formatted) data. Data that spanned a period of time was judged particularly meaningful as it could serve as a window into past memories and would allow for trends and changes over time to be observed. | “[It’s] almost too much […] for a normal person to be able to process and understand […] It could do with a document detailing, like, ‘this is what is in here’.” [P1] |
| The Practicality of Using (or Deleting) Your Data | Participants wanted to use returned data to better understand themselves, but given that returned data lacked visualisations and interpretations, they were unable to practically use data in this way. There was also a strong desire to delete held data, or restrict its use, though participants did not see a clear path to achieving this. | “[Companies did not] tell me what they are doing with [my data]… And sometimes I think my willingness to give a company data might be quite intrinsically linked with what they’re gonna do with it.” [P7] |
| Subtheme | Description | Quote |
|---|---|---|
| Power and Enforced Trust Through Data Holding | Participants feel that the sacrifice of (or the giving of permission to collect) personal data is a necessary cost in order to get the valued benefits of the services they want to use, something they are pressured to do and have no choice about. Such sacrifice is seen as the giving up of power, as participants lack access and control to that data. In the face of providers making decisions based on data and processes that they could not observe, participants felt powerless. This amassing of data was sometimes seen as surveillance, and some saw great potential for misuse and abuse of it. | “For me to have power over my data, I think is a fair and normal thing. But for a company to have power over [my] data means that it’s basically a proxy to have power over me.” [P8] |
| Accountability and Perceptions of Data Holders | Participants entered the study with varying impressions of providers, and wanted to assess data practices in order to hold them to account. Participants’ various observations reveal a strong link between their perceptions of providers’ data handling practices and the trust they hold in those same providers. | “When I like the company already, I’m more willing to give them my data.” [P2] |
| Changed Perspectives Through Scrutiny | In general, the more that participants found out about data-centric practices through the process of scrutinizing privacy policies and making data access requests, the more they distrusted providers. Failure to explain or provide complete data was harmful to trust. Conversely, where providers were more transparent or participants did obtain interesting data insights, trust was increased. | “If someone’s not completely open with you, then you’re like, ‘What are you hiding?’, which means you trust them less.” [P4] |
As Table 12 shows, in the majority (62%) of cases, participants wanted to see, know and understand what data was held about them and how it was used. For example, P11 wanted to know what data was collected by train company LNER when he bought tickets, so that he might judge whether it was appropriate:
“I’d be interested to understand what data they have […] Is it just the patterns of my spending on trains, or is it a bunch of other stuff that they’re using for advertising to me?”–P11
Beyond the data that participants had directly volunteered (see Table 9), most data was currently unknown to participants. In particular they wanted to gain awareness of what data might have been collected without their knowledge.
“The bit that concerns me is where I don’t know what data is being taken by companies. If I’m registering for a library or something, I know [what] data I’m giving to them, but what I don’t know is all the other stuff that they’re recording”–P9
Participants were equally unaware of what holders might infer from the data they had collected. P4 wondered if Philips could use data from his smart home lighting to deduce his sleep and TV-watching routines. P7 had received targeted advertisements relating to pregnancy that she felt weird about because she did not understand why she had been targeted in this way. P5 raised concern about how data inferences could affect decision making, surmising that the data holder had greater power than him because “they’re making decisions about things and you don’t know how they’re making those decisions”. Sharing of personal data was also insufficiently visible to participants; two participants (P3, P4) targeted GDPR requests to credit-check websites (Credit Karma, CheckMyFile) - P4 wanted to get “a picture of what other companies can currently expose”.
As detailed in 5.3.2, few requests resulted in timely provision of the requested data (44% or 68%, depending on whether referral to a download portal is excluded from or included in the count). Many data holders responded late or not at all; such actions are objectively a breach of the legislation. However, participants were broadly unsatisfied even when they did receive a GDPR response. In 100% of cases where data was obtained, it was considered incomplete, and this was usually seen as a further failure to comply. Participants had reviewed their GDPR rights in Interview 1 (though, as expected [REF 90], most were already aware), and so several participants saw this apparent non-compliance relative to their understanding of their rights as a poor quality of response, for example:
“I feel more concerned now, […] what they’ve given me seemed reasonable. But then comparing against what we asked them for, what I’m legally [entitled to], it’s a fraction.”–P5
For some participants, sceptical from the start, such poor responses were consistent with their expectations; P6 found the incompleteness of Facebook’s response “alarmingly unsurprising”. Others had expected compliance:
“I am surprised at Google’s unwillingness to provide me with all of the data… they haven’t provided me with all of my data. And that’s not legal.”–P7
Many participants, reflecting on a feeling of having less power than they had initially thought, felt that the prevalence of non-compliance showed that data holders hold too much power relative to the authorities, that insufficient pressure is being applied by regulators, and that “there needs to be more enforcement” (P11). P6 revised his view of Facebook’s power versus his own because, after review, he felt he could now clearly see “which [data] they are prepared to share and which they aren’t”. P11 also framed the selectivity of responses as an exertion of power:
“It seems like there’s a lot of derived data about things like purchases and stuff [that I would expect] that just isn’t there. So they’re free to not give me the data. That, to me, suggests [that despite GDPR] they retain an awful lot of power.”–P11
While in some 22% of cases participants did meet their goals through GDPR (see Table 12), when it came to the desire for greater awareness and understanding discussed in 5.4.2.1, this want was largely unmet. Volunteered data such as basic personal information or user-generated content was usually returned complete. This was often viewed as mundane and uninteresting, and the focus on these data types in returns was viewed as evasiveness. Facebook, P6 observed, “give you that kind of descriptive boring data which is mainly all publicly available anyway” and had omitted “the stuff that I would consider valuable to them”.
In general, the data responses did not provide the answers participants sought. Many reported “still” not knowing what they wanted to find out. P4 said he remained “in the dark”. P7 stated that “even though I did the process correctly, I still didn’t get that much back”. Concerns held by participants from the outset remained unaddressed, as in P11’s case:
“I still am quite concerned about how much data organisations have, particularly how they link that other data and how data is bought and sold, and I haven’t really got any answers on that.”
It was not just the data returned, but the process itself, that participants were dissatisfied with; requesting and achieving data access was time-consuming and difficult. “Jumping through hoops” was a phrase used independently by four different participants (P4, P5, P7 & P11) to describe the experience. Some found data holders obstructive and unhelpful:
“I feel like they give you a response that [makes it so] you cannot proceed intentionally”–P10
Participants recognised that they had received help and coaching, and that the processes were so tedious that others may not have persisted. P1 suggested that without the provided template, it would be “a lot harder to get meaningful data out”, and P7 attributed her sole successful request to the guidance she had received in progressing it. P5, having experienced problems with expiring links, delayed responses and missed emails, had been surprised at “how difficult it was just to get my data, and the fact that I had to ask them about six different times”.
Not all requests were this painful; some were handled smoothly. As P11 put it, “Some companies make it dead easy to get, but then the data is not massively useful.[…] Other companies make it a pain in the neck to get it.” Overall the view of GDPR data access was one of disappointment. Participants found GDPR ineffective: P10 said “Frankly, [GDPR] doesn’t have as much influence as I expected” and P1 commented that:
“It’s kind of disappointing, because I would have hoped that this process would have levelled the user power versus the organisation power in a way that holds them accountable and [it doesn’t] seem to be doing that.”
Prior to receiving data, participants had anticipated discovering insights about their own lives by browsing and reflecting on their personal data, consistent with the personal informatics literature (Li, Dey and Forlizzi, 2010). However, there was a comprehension gap between the useful information they imagined and the actual data returned; data was typically delivered as a bundle of technical files, which were hard to understand and often supplied without explanation. Some felt (echoing the concepts of effective access described in 2.1.4) that they lacked the necessary skills or tools to make the data understandable or usable “for a non-techie person” (P11). When the researcher guided P7 to jsonlint.com, an online formatter, she found her data more understandable. P2 made the point that data holders must themselves be using tools to make sense of people’s data: “they’re not just looking at a JSON file, so I would like to have the same visualisation [as them]”.
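The reformatting step that helped P7 need not rely on an external website; the same effect can be achieved locally with standard tooling. A minimal sketch (the sample record below is invented):

```python
import json

def prettify(raw_json: str) -> str:
    """Re-serialise compact JSON with indentation and sorted keys,
    making a machine-oriented export easier for a person to read."""
    return json.dumps(json.loads(raw_json), indent=2, sort_keys=True)

# Invented example of a compact, machine-oriented export record.
compact = '{"plays":[{"track":"Song A","ts":1577836800}],"user":"p7"}'
print(prettify(compact))
```

Even so, indentation only addresses legibility; it does not supply the explanations of fields and codes that participants also lacked.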
There was a sense that, in sending people individual data files, data had been removed from the environment in which it has meaning, and that the returned data excluded context necessary for interpretation. This was often seen in the form of internal codes and abbreviations that individuals could not understand. P4 said of his experience looking at smart-lightbulb data from Philips Hue that there was “just so much of it that it’s impossible to know [what it all means]… You’d have to spend a few hours going through this and being like, ‘OK, what does that line mean, and that symbol, and that code?’”. This lack of context also manifested as a failure to explain decision-making processes: P5 reflected, when looking at driving scores from a car insurer that uses a mobile app to monitor her driving, “I could see the data – it was the score that was weird for me. Like, it doesn’t tell you how it’s calculated.” P1 noticed that although some companies did make some effort to explain the returned data, this varied substantially across providers. He said that “it would be nice if these companies had a standardised model of how this information is presented to people, so it [could] be easily understood”.
One of the greatest obstacles to understanding was the sheer volume of information and the absence of any means to quickly digest or navigate it: either very large files, or complex hierarchies of nested directories containing many separate files. There is a clear need for summaries so that participants can quickly get a handle on what is, or is not, present. Returned data “could be valuable if you knew what the hell [was] in there” (P4). P1 described one of his data responses as “almost too much […] for a normal person to be able to process and understand.” He said that it “could do with a document detailing, like, ‘this is what is in here’”, and described the disparity across responses as “either like death by thirst or death by drowning […] It would be better to drown, but still not ideal”. Ultimately, it is clear that returned data was generally not presented in a way optimised for understanding.
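The ‘document detailing what is in here’ that P1 asked for could be generated automatically, by either the data holder or the recipient; a sketch of such an inventory, with all paths hypothetical:

```python
from pathlib import Path

def inventory(export_dir: str) -> list[str]:
    """Walk a data export and list every file with its size, giving a
    quick overview of what a large nested package contains."""
    lines = []
    for path in sorted(Path(export_dir).rglob("*")):
        if path.is_file():
            size_kb = path.stat().st_size / 1024
            lines.append(f"{path.relative_to(export_dir)} ({size_kb:.1f} KB)")
    return lines
```

Run against the root directory of a returned data package, this produces one line per file; even such a flat listing answers the basic question of what is, or is not, present.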
Another question that the findings were able to shed some light upon, in service of RQ1, was what precisely makes data valuable to individuals. This is especially important given that participants did identify the potential to gain personal benefits from their data (as seen in the second set of goals in Table 12). An idea that came up repeatedly was that data is most valuable when it spans a period of time and can be related to events in the individual’s life over that period. This could potentially provide new insights to participants. P2, for example, hoped to see, or be able to construct, breakdowns and charts that would help him examine his food shopping habits. Through the GDPR process, P10 accessed details of her spending on micro-transactions in the mobile game Pokémon Go that had not been available to her through the app. P11 wanted to derive insights about his train travel by examining the geography, cost, journey length and patterns of his past journeys through data he hoped to receive from LNER. Long-duration data offers the potential to identify trends and changes in one’s own behaviour over time.
It was these historical parts of their data that participants found most meaningful, offering as it does a means of remembering, with data potentially serving as a “window into your past” (P11). P5 saw value in perusing music-listening data “just because it’s cool to look back on stuff that you’ve done and you don’t necessarily distinctly remember it”. Generally the longer period the data covered, the more valuable it was deemed to be:
“I would actually be interested in last.fm, partly because the data goes back to 2008 … Spotify only goes back about four or five years and not everything I listen to is on Spotify”.–P11
P6 saw the data accumulated by service providers as potentially forming part of a valuable background context to understanding life events in his past: “I would like to […] build a picture, not just like, ‘I remember going to Reykjavik’, but if there’s other data around that time [I could] sort of paint a biography of myself” and described some of his data as “a kind of personal history that has been quantified and sort of datafied”.
The personal value that captured data can offer makes it all the more important that participants be able to understand and make use of their data. Our participants found that the format in which data was returned often meant that it was not only difficult to understand, but difficult to use as well. Using data meant different things to different participants, with imagined uses including budgeting, record-keeping/archiving, or using the data for creative or fun purposes. Some participants (e.g. P5) saw value in potentially combining data from multiple sources, though this did not turn out to be practical. Participants did not know what data to expect, and generally imagined returned data being more useful than it turned out to be:
“I think … you could do some interesting mashups, but I don’t really know what with until I’ve got the data. It depends on the data; I’m sure there could be some cool uses of it.”–P4
Once data was received, participants struggled to interpret and understand it to a sufficient extent to be able to identify the useful data or meaningful information they had hoped for. Returned data formats and response structure were extremely varied. Some reported that there was not sufficient machine-readable data to make use of the data. For example, P4 received a Microsoft Word document full of pasted screenshots from an internal portal as part of his response from his ISP Virgin Media, and said that its usefulness “depends on what you want to get out of it, really. If you want to view the data they have about you, it’s quite usable. If you want to do something automated, then it’s not”. P11 found a similar returned screenshot from an internal system to be “completely non-understandable”. In other cases, the opposite problem occurred, with data being too technical for the participant to use. P10 said of JSON data: “For normal people who don’t understand programming, I feel it’s just, there’s no use at all.” P7 felt she lacked the technical proficiency to make use of the returned data:
“They have provided it in formats where I can see that, if I were a developer, I could do things with it, […] but if I was not that sort of person, it might be quite difficult to understand”–P7
In P5’s case, she saw the potential to use the data but felt that what was missing was additional explanation or guidance on how to access the valuable information within it:
“They did give me the data, but not how it fitted together. It’s like being given the bricks to a house, and then they’re like ‘Here’s your house’. It doesn’t really mean anything when it’s just bricks, if you don’t know how to put it together.”–P5
P11 highlighted a problem with his Tesco shopping data that was not just a matter of formatting or skill, but the granularity or focus of the data itself:
“As a technical person, having a CSV of data is quite useful, potentially, but actually what can I do with that if it’s Tesco’s internal systems data?”
While on the face of it the findings of 5.4.3.1 and 5.4.3.2, with their conflicting demands for both more technical and less technical data, might seem contradictory, what we can infer is that participants collectively need both usable technical data and easy-to-read information summaries - and that those summaries should cover both the relatable life information within the data and information about the data itself: what it means and how to use it. This idea is explored further in (Bowyer, 2021).
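One way a data holder might serve both needs at once is to derive a plain-language summary directly from the same machine-readable export it already produces. The sketch below assumes an invented JSON layout (`service` and `data` keys); it is illustrative only, not any provider’s actual format:

```python
import json

def summarise_export(export: dict) -> str:
    """Produce a plain-language overview of a machine-readable data export."""
    lines = [f"This file describes your '{export.get('service', 'unknown')}' account."]
    for category, items in export.get("data", {}).items():
        lines.append(f"- {category}: {len(items)} records")
    return "\n".join(lines)

# A tiny invented export, as it might be parsed from a returned JSON file.
raw = '{"service": "example-shop", "data": {"orders": [{}, {}], "addresses": [{}]}}'
print(summarise_export(json.loads(raw)))
```

A cover document generated this way would address P1’s wish for “a document detailing, like, ‘this is what is in here’” without any cost to the machine-readability that technical participants valued.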
Having recognised the potential value of data relating to their lives, whether before or during this research, several participants were concerned about personal data being held. P10 for example said with reference to dating site Bumble: “Since I found my partner [and therefore no longer need a dating site] I deleted my account and I’ve been wondering, ‘Are they still keeping my data at the back?’” and, with reference to both Instagram and Bumble, expressed a desire to have her data deleted and expected GDPR to play a role in the enforcement or verification of that deletion, something she could not otherwise be sure of. P8 considered the holding of sensitive data to be a liability that she was only willing to tolerate while she was actively using a service, and this was part of her motivation for targeting Natural Cycles:
“I now use a different one, but I used, for about a year [their] app to track my menstrual cycle. [It was my] main contraception method, so that’s things that this company probably has. Now that I’m not using it any more, I don’t know if they delete the things or not”
Many participants expressed a desire that data be held only for a short time, and questioned the default practice of data being kept beyond the period where it was needed to deliver a service:
“The thing that concerns me is that I haven’t used Tesco online for at least four or five years, so why are they hanging on to my IP address from five years ago?”–P11
He went on to spell out the liability he saw in such apparently mundane data being held, a liability arising from the duration of the data: “10 years’ worth of shopping records… how much would that be worth to a health insurance company, and would they succumb to the temptation to sell that on?” P10, a Chinese citizen, identified long-term sources of personal data as an enabler for future privacy violations, saying that “in China, [there is a trend] that as soon as someone becomes famous, people begin digging [through] all their past experiences”.
Most participants equated having control over their data with the ability to delete it or enforce its deletion, and given the current practical lack of such a capability, felt that they had insufficient control over data holding. One of the first steps participants identified in gaining control of their data was simply the ability to see it, for accountability, so that they might check the accuracy, security and breadth of collected data and flag any unforeseen concerns. They felt that a deeper understanding might lead to an increased sense of individual safety and data control, and enable them to make changes in data habits or choice of service provider:
“I want to understand how much they’re keeping. And what they’re doing with it. I’m hoping that by knowing that, I might change my behaviour about all the data I accidentally create.”–P7
In this participant’s case, this hope was unsatisfied, and upon looking back at her experience she remarked:
“I guess that’s one of my criticisms of GDPR in general - that although I can understand what data a company holds about me, there’s no obligation for them to tell me what they are doing with it. And sometimes I think my willingness to give a company data might be quite intrinsically linked with what they’re gonna do with it.”–P7
In fact, that legal right does exist through GDPR, but as we can see it was not delivered in practice. The ability for participants to feel aware and in control of their data must begin with better data legibility and explanations of data use, accompanied by clear pathways to enable data correction or deletion.
Data Holders Enforce an Uneasy Trust
This study examined the GDPR’s effectiveness in improving individuals’ access to and control over their personal data. The participants’ experiences support the established power imbalance (see section 2.1) and suggest the GDPR largely fails to empower individuals: both objectively (to the extent possible by this limited sample), in that most companies do not comply fully (either by returning insufficient and inadequate data, or by failing to return data on time or at all), and subjectively, in that returned data was often difficult to understand, impractical for use, and raised new questions and concerns. The findings also indicate that swift, transparent, and easy-to-use GDPR procedures can positively impact an individual’s perception of an organisation. In light of these findings, this discussion offers insights on how the personal data landscape might be redesigned through policy (5.5.1) and business practice (5.5.2), and how individual action can have important impact too (5.5.3) – all in pursuit of the human-centric empowerment goals described in 5.1 as well as 2.2 and 2.3:
Despite significant and obvious investment in dashboards, processes and bespoke data package production, the findings (while limited by the small number of participants) indicate that inadequate compliance with the GDPR is common. The findings are also consistent with the literature: the participants’ issues with completeness and compliance echo those first reported within the GDPR’s first year [REF 9], suggesting that completeness and compliance have not improved over this period. The focus here, however, was on the effectiveness and experience of engaging with GDPR procedures from the individual’s perspective. Participants’ experiences were overwhelmingly of disappointment and frustration, with their hopes rarely met. They found that data holders often did not engage meaningfully with the process, and that responses typically excluded or obscured data that could have provided the insights into their data privacy and the organisation’s data practices that they sought. Evaluations of perceived power relative to data holders largely remained the same or worsened after accessing data through the GDPR, and participants were not confident in the legislation’s capacity to shift the balance of power. The process was perceived by some as a “box-ticking exercise” that was both frustrating and time-consuming and did not ultimately help them. Even though in 7% of cases participants did feel empowered by the GDPR, all participants who received data were in practice left with additional time-consuming, and sometimes technically skilled, work in order to interpret or take advantage of their returned data. This suggests that to improve the situation, policymakers need to make changes towards:
1) Better Compliance Through Enforcement of Complaints. At present, enforcement of the GDPR is uneven; each country has its own DPA (for example in the UK, this is the Information Commissioner’s Office or ICO) and complaints are rarely pursued for individual cases. Instead, cases are processed by specific DPAs in a form similar to a class action lawsuit. This means that individuals have little impact when they do raise a complaint, and many GDPR complaints “become lost or resulted in lengthy delays” [REF 21], or may even be erroneously dropped [REF 72]. Until individuals have a clear and effective means to issue complaints [REF 11] that result in enforcement action (or a clear threat of it), it is likely that individuals will continue to have little recourse other than to repeat the request and hope similarly dissatisfied individuals will act on their behalf. Data holders must be held to account when they do not deliver the full set of data that they report possessing, or when they fail to do so within the legally obligated time frame.
2) Policies to Enforce Better Quality Responses. Many participants received data in frustrating formats, including screenshots, printouts or files that were too technical or littered with acronyms. Data was provided in formats either too technical to understand or not technical enough to be usable (see 5.4.3.1), showing a demand for both human-readable information summaries and machine-readable data files, where most providers typically provide only one or the other. Policymakers could suggest data formats or even propose new standards; this would aid data portability, improve effectiveness [REF 44] and legibility [REF 80], reduce costs through common tooling, and catalyse the building of tools to interpret and understand data. Such standards are emerging [REF 78], as they are a technological necessity for data unification, but lack adoption.
3) Policies to Enforce Data Access as Ongoing Support, not One-Time Delivery. A radical redesign of policy is needed to give people the practical outcomes they desire and, according to the GDPR itself, deserve. Data access needs to be seen as more than “the delivery of data files”. People need understanding of their data and of its handling, and this is the measure by which compliance should be assessed. The explanations GDPR mandates are not forthcoming; of the 119 hopes expressed by participants (see Table 5), 70 (59%) related to acquiring greater understanding of data practices. 38 (54%) of these were unmet, and a further 15 (21%) were only partially met. By mandating data holders to support individuals with not just the delivery of data, but assistance to understand that data, policies could become more impactful, not least because such understanding is critical to inform judgements around consent, loyalty and compliance.
While this study, and the GDPR itself, might seem adversarial to data holders given the goal to reduce their power by imposing new procedures, the findings emphasise the role of personal data in consumer relations. Data holders are likely aware of the paramount role of personal data in decision-making, but may not be aware of individuals’ perceptions about this. The findings suggest that failure to satisfy users who are concerned about the collection and usage of their personal data risks harms to consumer trust and confidence, at least for those users, and perhaps for others they might influence. In turn, however, this presents opportunities to use the mechanisms of the GDPR for customer loyalty and building better relations. In 52% of cases, the process of examining privacy policies and engaging in GDPR data requests resulted in a decrease in reported trust in the data holder. While such impacts may for now be minimal, as only a small proportion of users read privacy policies [REF 92] and–one can assume–an even smaller number conduct GDPR requests, this is likely to change as issues around data privacy and trust continue to take centre stage in global geopolitics [REF 98,REF 107]. Furthermore, the growing number of businesses focused on “getting your data” or “taking control” [REF 25,REF 40,REF 97,REF 116–118] suggests that demand for data access is growing. From the findings, there are three positive takeaways for data holders:
1) Data transparency is an opportunity to increase customer loyalty and trust. GDPR’s basic rights provide a starting point for delivering practical data transparency that will allow organisations to demonstrate that they are deserving of trust. By responding clearly and engaging openly and helpfully with GDPR data requests, organisations can demonstrate consistency between their privacy policy and their actions and demystify to their users the role that data holds in their business model. Research has shown that explanations can “ease humans’ interactions with technology […], help individuals understand a system’s function, justify system results, and increase their trust” [REF 42]. In 14% of cases, participants felt more trusting of the service brand as a result of their GDPR experience (sometimes even displacing prior apprehensiveness or distrust), citing reasons such as speedy, hassle-free responses, clear and understandable data, providers being upfront and open with data, and staff who exhibited a positive attitude to the request.
2) Data transparency is an opportunity for improved and re-imagined customer relations around data. Beyond the opportunity to improve trust, the mechanisms of data transparency suggested by the GDPR provide individuals with new capabilities for data curation and involvement. By offering individuals the ability to engage in empowering data interactions, data holders have the opportunity to improve engagement with their organisation and their services. If organisations view personal data as a shared resource to be curated and co-owned by the individuals that contributed it, there may be correspondingly shared benefits: for the individual, a sense of agency, influence and negotiability [REF 80]; and for the service provider, an incentive for individuals to generate and share more data, an increased likelihood of individuals correcting inaccurate data, and more reliable and human-centric forms of ongoing consent closer to dynamic consent [REF 61] than today’s ineffective models of informed consent [REF 73].
3) New customer demands indicate untapped business opportunities. As the 500-member-strong MyData Global organization [REF 82] shows, there is growing demand for personal data empowerment. People’s personal data is splintered and trapped [REF 1,REF 16], and they cannot correlate data from different sources in order to reflect upon it, gain insights, and set goals [REF 70]. Due to commercial motivations, service providers generally deliver capabilities within a closed silo, not at the level of one’s wider environment [REF 2]. To be better empowered the individual could be the point of integration, the centre of their own Personal Data Ecosystem (PDE) [REF 81]. Life-level capabilities [REF 17] and the opportunities that well-designed and well-regulated GDPR-type regulations promise in this regard have not yet been exploited. Thorough, complete and timely data access in standard formats, as mentioned above, will be critical to enabling this vision. Growing companies such as CitizenMe [REF 119], Digi.Me [REF 38], Mydex [REF 125], ethi [REF 58], HestiaLabs [REF 34], udaptor [REF 97] and exist.io [REF 120] as well as larger organisations like BBC R&D [REF 13] and Microsoft [REF 75] are already starting to innovate in this space.
While participants experienced disappointment and frustration in their GDPR journeys, all participants gained new understandings; if not always of their data itself, at least of their target companies’ approach to data access requests. This new knowledge was sufficient to re-affirm or challenge existing attitudes and inform judgements–P1, for example, left Facebook after the study. Even an attempt to access data can be educational, and even a cursory look at a provider’s ‘What data do we collect’ privacy policy section can provide pause for thought. Today, individuals remain largely in the dark about the collection, use and sharing of their data, through a combination of perceived complexity and effort and a lack of clear benefits. Table 12, alongside the increased control and insight promised by the PDE movement and the platforms linked in 5.5.1 and 5.5.2 above, provides a glimpse of what the future may hold: a world where individuals take more control of their data and gain actionable self-insights. Three key messages for individuals can be inferred:
1) Your data is used to represent you and define your user experience. We hand over our data in exchange for access to services, but providers then use it (usually in aggregate), for example to inform product design or decide what content we see. This ‘innocent’ handover of data in fact gives providers the means by which we are treated and – at times – controlled. Recognizing the crucial role of data (and our limited influence over it) is the first step to pursuing greater agency and control.
2) Your data contains meaningful and valuable information about your life. Data, as participants found, is dry and technical, but they all sought meaning and value within it (see 5.2.2). Within data lies potentially rich information about one’s life and past activity – some of which may be inaccessible through any other means. This highlights both a risk (that others might gain this insight) and a potential benefit (that we could access this insight ourselves). In this context, data deletion without keeping a copy may be inadvisable. To access the value in data, individuals will need to demand data standards, better access and control mechanisms, and insight tools.
3) Self-education and awareness enable accountability and informed choices. The findings highlight a lack of knowledge. Transparency is critical to judging ‘to what extent the bargain is fair’ [REF 66]. It is not always delivered, but GDPR makes it your right; a right that cannot be fully refused. Through challenging poor GDPR responses and demanding better information, individuals can have impact. Providers are ultimately motivated by public demand—one of the reasons download dashboards exist. Through the public pressure of negative attention, companies can be motivated to improve data access [REF 33]. With patience, GDPR rights can be exploited to force small changes.
Through a longitudinal study of 10 participants lasting three months, this case study has qualitatively, and to a lesser extent quantitatively, evaluated the human experience of using one’s GDPR access rights and of living with data-centric service provider relationships.
The findings, while not statistically representative, suggest that people currently lack awareness of held data and its uses by service providers. By guiding participants on a journey of discovery and careful scrutiny, encouraging them to draw their own conclusions about service providers on the basis of companies’ own promises, individuals’ legal rights, and participants’ own hopes (see Table 12), this research has shown that such a journey can be educational and enlightening with regard to increasing awareness, but also can seriously damage brand loyalty and trust in providers if comprehensive and well-explained data is not returned in a supportive and open manner (see 5.4.4).
The experience of exercising GDPR rights seems to be an unsatisfactory one for individuals; participants were generally still ‘in the dark’. Serious problems with compliance have been highlighted (see 5.4.2.1): participants received data that was incomplete and impractical to use, and they failed to acquire the explanations they desired. Measured against its own aim of enhancing individuals’ rights and control, the GDPR does not succeed. Participants continued to feel a lack of agency and choice, were largely unable to pursue goals such as data checking, correction or deletion, and their perceived sense of power within the provider relationship was largely unchanged by the experience. Nor does the GDPR allow individuals to adequately pursue their own goals related to accountability, self-reflection or creative data exploration (see 5.4.3.3). Individuals cannot be given power over their data through designing better Human-Data Interaction interfaces alone, but only through redesigned policies and business strategies that take into account the sociotechnical context [REF 12, REF 17].
In order to bring the human-centric ‘personal data ecosystem’ concept closer to reality, action must be taken to improve both the compliance and the quality of GDPR responses. Considering these findings, there is cause for radical policy reform: to move away from ‘data access as package delivery’ and to provide individuals with a more effective and ongoing two-way window into their data (see 5.5.1), offering ongoing awareness, accountability, and negotiability. Data needs to be expressed to individuals in ways they can understand: little to no practical impact is currently achieved by delivery of a one-time snapshot of technical files, and in fact we have shown that such responses can in many cases be harmful to customers’ perceptions of the data holder.
For providers, the risk of reputational damage uncovered by this study should motivate them to engage meaningfully with data access requests; but such risk can be averted by redesigning both interfaces and processes to approach data access experiences as an opportunity to educate, and to build trust and loyalty, perhaps even through establishing progressive co-operative data stewardship relationships that truly involve the service user (see 5.5.2). While the GDPR experience is often disappointing and frustrating, it can provide insights that help individuals to challenge their assumptions, re-evaluate choices, and in some rare cases, feel empowered to act upon their data. Wider assertion of GDPR rights could demonstrate a desire for data holders to be transparent; without such visible demand, little may change (see 5.5.3).
Considering RQ1 (the pursuit of a deeper understanding of people’s attitudes to everyday data holding and people’s wants from that data), this work suggests that people struggle to develop the meaningful relationship with their data that they desire because of the difficulties they face in seeing, accessing and understanding it. They are aware that within data is the potential for value to themselves, but cannot access that value, which in turn causes feelings of resignation, concern, distrust or suspicion towards data holders. What they seek most are two things: sufficient understanding to better judge the value exchange they have signed up for with providers (see goals in the top half of Table 12), and good quality insights from data that would allow them to understand themselves better, learn from the past, set personal goals, and harness personal data for their individual benefit (see goals in the lower half of Table 12). This duality of needs around data interaction is expanded upon in (Bowyer, 2021).
With respect to RQ2 (the pursuit of a better understanding of the role of that data in everyday service relationships), the findings suggest that personal data held by providers, as in Case Study One, serves as a proxy for direct user involvement, and is treated as such. Once users have sacrificed their data, or given permission for its collection, they are rarely consulted, and most services exclude them from seeing how that data will travel through the organisation and be used in decision-making; this is consistent with the ‘point of severance’ concept observed by Luger and Rodden (Luger and Rodden, 2013). As a result, the trust relationship between service provider and service user is extremely fragile and highly susceptible to subjective impressions of service brands; as the findings show, discovery of poor data practices or a lack of transparency around data is sufficient to harm that relationship and in some cases even motivate individuals to change provider. As discomfort grows and scrutiny occurs, providers can expect customers to lose trust and loyalty. At the same time, this same data could play a central role in a re-invigorated relationship between provider and user, one based upon earned trust. It appears that providing easy, clear data access and showing a willingness to respond to questions and explain data usage to users could be sufficient to allay concerns and instil strong customer loyalty. Of course, this assumes that the openness offered reveals practices the user finds agreeable; this may partly explain why companies with more commercially-motivated approaches to personal data use (such as Facebook and Google), whose practices many would find disagreeable upon examination, appear less willing to engage in transparency and user empowerment around data.
The general principles of earning trust through transparency, and rethinking data access as a means to involve users in decision-making, could be applied in a wide range of service endeavours that are currently very data-centric.
This is not a data chapter; it unifies the findings of Chapter 4 and Chapter 5 in the context of RQ1 and RQ2 to provide a common set of findings.
“The world is working exactly as designed. And it’s not working very well. Which means we need to do a better job of designing it.” —Mike Monteiro, author of ‘Ruined by Design’
In this chapter, which opens the second part of this thesis, I build upon the newfound understanding of the better human-data relations that people need, and start to consider how these goals might be achieved in practice. This second part of the thesis aims to answer the third sub-research question, RQ3: ‘What challenges and opportunities exist for improving human data relations in practice?’. While the exploration of this question has also been informed throughout the PhD by other research activities, including my work within the SILVER project (see 3.4.1.1 and 3.4.3.2) and my work on web augmentation (3.4.3.2), RQ3 is largely and substantively examined through my third PhD Case Study, introduced below, in which I was embedded remotely for three months as a full-time intern within the British Broadcasting Corporation (BBC)’s Research and Development department, working with specialists, designers, researchers and developers on an exploratory research project codenamed ‘Cornmarket’ during the summer of 2021. I continued this involvement as a part-time research consultant and critical friend for a further five months after the conclusion of the initial three-month placement.
In section 7.1 I….
[TODO Add in some back-reference to Discussion Part 1 - the intermediate understanding of human data relations needs that precedes this chapter and concludes part 1 - once that section has been written.]
As part of its Royal Charter, one of the BBC’s lesser-known obligations is to maintain a ‘centre of excellence’ for research and development in broadcasting and electronic media [REF BBC Charter], and to this end it employs over 200 researchers in its R&D department, looking at everything from AV engineering and production tools to new forms of media, virtual reality, digital wellbeing and human-data interaction. The Cornmarket project, launched in 2019, is a BBC-internal human-data interaction research project which explores a possible role for the BBC as it moves beyond broadcast television, using its public service responsibility to guide citizens to a position of empowerment within today’s digital landscape - encompassing not just entertainment but health, finance and self-identity. Due to its unique funding from UK-wide TV licensing and its duties not only to entertain but to inform and educate the general public, the BBC is uniquely placed to take a more human-centred approach than commercial innovators in this space, as it needs only to deliver value, not profit. The project is exploring the use of Solid [REF] technology to build a working Personal Data Store (PDS) prototype (see 2.3.4), while also developing, iterating and trialling user interface designs and conducting participatory research interviews and activities, all to explore what form a BBC PDS might take and what features its potential users might value.
The proposed BBC PDS product would allow people to populate a PDS with personal data from APIs and data downloads from a variety of services including BBC iPlayer, Netflix, All4, Spotify, Instagram, Strava, Apple Health, banks and finance companies, as well as social media companies such as Facebook, LinkedIn and Twitter, and then to use these combined data sources to create personal “profiles” for Health, Finance, Media/Entertainment and Core, within which various data insights, visualisations and capabilities would be delivered. One feature the work explores in depth as potentially valuable to users is the ability to include and exclude certain datapoints from the imported viewing history data, in order to present a more accurate, curated view of oneself that could then be fed back to other applications such as BBC Sounds to give better content recommendations.
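At its core, the curation feature described above amounts to filtering user-excluded datapoints out of an imported viewing history before it is shared onward. The sketch below is a minimal illustration of that idea; all identifiers and field names are invented and do not reflect the Cornmarket prototype’s actual data model:

```python
def curated_history(history, excluded_ids):
    """Return the viewing history with user-excluded datapoints removed."""
    return [item for item in history if item["id"] not in excluded_ids]

# Invented example: the user hides one item from their curated self-view.
history = [
    {"id": "ep1", "title": "Nature Documentary"},
    {"id": "ep2", "title": "Late-Night Quiz Show"},
]
print(curated_history(history, {"ep2"}))
```

The design significance lies not in the filtering itself, which is trivial, but in where it happens: inside the individual’s own data store, before any downstream application such as a recommender ever sees the data.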
With a cross-disciplinary team of around 20 people, including architects, developers, user experience designers, product designers, innovators, participatory researchers and marketers, and funding to outsource public engagement research to agencies, this project represents a significant player in the emerging personal data economy (see 2.3.4). As such, the Cornmarket project is fertile ground in which to learn more from practitioners in the PDE space and to test the learnings of this thesis in practice, while also finding deeper insights in response to my research questions - in particular RQ3, which is concerned with the building of more human-centric personal data interfaces in practice. I took a three-month sabbatical from my PhD to join the project full-time as a Research Intern during the summer of 2021. Details of the work I carried out and participated in are presented in the next section. My involvement in the project can be seen as the conclusion of one of several action research cycles within my PhD (as detailed in 3.2 and Figure 3).
[target 900 words]
[target 4,000 words]
[target 4,000 words]
[target 600 words]
[Target x words]
Abbattista, F. et al. (2007) ‘Shaping personal information spaces from collaborative tagging systems’, in Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics). (PART 3), pp. 728–735. doi: 10.1007/978-3-540-74829-8_89.
Abiteboul, S., André, B. and Kaplan, D. (2015) ‘Managing your digital life with a personal information management system’, Communications of the ACM, 58(5), pp. 32–35. doi: 10.1145/2670528.
‘About The Quantified Self’ (no date). Available at: https://quantifiedself.com/about/what-is-quantified-self/ (Accessed: 22 March 2021).
‘About Us’ (no date). datacy. Available at: https://datacy.com/personal-about (Accessed: 31 March 2021).
Abowd, G. D. (2012) ‘What next, ubicomp?: celebrating an intellectual disappearing act’, in Proceedings of the 2012 acm conference on ubiquitous computing. New York, New York, USA: ACM Press, pp. 31–40. doi: 10.1145/2370216.2370222.
Abowd, G. D. et al. (1999) ‘Towards a better understanding of context and context-awareness’, in Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), pp. 304–307. doi: 10.1007/3-540-48157-5_29.
Abowd, G. D. and Mynatt, E. D. (2000) ‘Charting Past, Present, and Future Research in Ubiquitous Computing’, ACM Transactions on Computer-Human Interaction, 7(1), pp. 29–58. doi: 10.1145/344949.344988.
Ackoff, R. L. (1989) ‘From data to wisdom’, Journal of Applied Systems Analysis, 16(1), pp. 3–9.
Adams, R. (2017) ‘Michel Foucault: Discourse’. Available at: https://criticallegalthinking.com/2017/11/17/michel-foucault-discourse/ (Accessed: 7 May 2021).
Alizadeh, F. et al. (2019) ‘GDPR-reality check on the right to access data’, in ACM international conference proceeding series. New York, New York, USA: ACM Press, pp. 811–814. doi: 10.1145/3340764.3344913.
‘AllofMe Company Profile’ (2007). Available at: https://www.crunchbase.com/organization/allofme (Accessed: 23 March 2021).
‘AllofMe.com Teaser Clip’ (2008). YouTube. Available at: https://www.youtube.com/watch?v=JWyqt4WL6xE (Accessed: 21 March 2021).
Andrews, R. (2005) ‘GTD : A New Cult for the Info Age’, Wired. Available at: https://www.wired.com/2005/07/gtd-a-new-cult-for-the-info-age/.
Apple (2009) ‘iPhone 3G Commercial: "There’s an app for that"’. YouTube. Available at: https://www.youtube.com/watch?v=mFlITzqRBWY.
Arfelt, E., Basin, D. and Debois, S. (2019) ‘Monitoring the GDPR’, in Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), pp. 681–699. doi: 10.1007/978-3-030-29959-0_33.
Aslam, H. et al. (2016) ‘Harnessing Smartphones as a Personal Informatics Tool towards Self-Awareness and Behavior Improvement’, Proceedings - 2016 IEEE 14th International Conference on Dependable, Autonomic and Secure Computing, DASC 2016, 2016 IEEE 14th International Conference on Pervasive Intelligence and Computing, PICom 2016, 2016 IEEE 2nd International Conference on Big Data. IEEE, pp. 467–474. doi: 10.1109/DASC-PICom-DataCom-CyberSciTec.2016.92.
Ausloos, J. (2019) ‘GDPR Transparency as a Research Method’, SSRN Electronic Journal, (May), pp. 1–23. doi: 10.2139/ssrn.3465680.
Ausloos, J. and Dewitte, P. (2018) Shattering one-way mirrors - data subject access rights in practice. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3106632.
Ausloos, J. and Veale, M. (2020) ‘Researching with Data Rights’, Technology and Regulation, pp. 136–157.
Bannon, L. J. (1995) ‘From Human Factors to Human Actors: The Role of Psychology and Human-Computer Interaction Studies in System Design’, Readings in Human–Computer Interaction, pp. 205–214. doi: 10.1016/b978-0-08-051574-8.50024-8.
Barbosa Neves, B. and Casimiro, C. (2018) Connecting Families?: Information & Communication Technologies, Generations, and the Life Course. Policy Press.
Barreau, D. K. (1995) ‘Context as a factor in personal information management systems’, Journal of the American Society for Information Science, 46(5), pp. 327–339. doi: 10.1002/(SICI)1097-4571(199506)46:5<327::AID-ASI4>3.0.CO;2-C.
Barreau, D. and Nardi, B. A. (1995) ‘Finding and reminding’, ACM SIGCHI Bulletin, 27(3), pp. 39–43. doi: 10.1145/221296.221307.
Bate, A. and Bellis, A. (2018) The Troubled Families programme (England). July.
Battarbee, K. and Koskinen, I. (2005) ‘Co-experience: user experience as interaction’, CoDesign. Taylor & Francis, 1(1), pp. 5–18.
BBC R&D (2017) ‘Human Data Interaction - BBC R&D’. Available at: https://www.bbc.co.uk/rd/projects/human-data-interaction.
Beck, K. et al. (2001) ‘The Agile Manifesto’. Available at: http://agilemanifesto.org/.
Bell, G. and Gemmell, J. (2009) Total recall: how the E-memory revolution will change everything. Dutton.
Bergman, O. (2013) ‘The Effect of Folder Structure on Personal File Navigation’, Journal of the American Society for Information Science and Technology, 64(July), pp. 1852–1863. doi: 10.1002/asi.
Bergman, O., Beyth-Marom, R. and Nachmias, R. (2003) ‘The user-subjective approach to personal information management systems’, Journal of the American Society for Information Science and Technology, 54(9), pp. 872–878. doi: 10.1002/asi.10283.
Bergman, O. et al. (2008) ‘Improved search engines and navigation preference in personal information management’, ACM Transactions on Information Systems, 26(4). doi: 10.1145/1402256.1402259.
Bergman, O. et al. (2012) ‘How do we find personal files?: The effect of OS, presentation & depth on file navigation’, Conference on Human Factors in Computing Systems - Proceedings, pp. 2977–2980. doi: 10.1145/2207676.2208707.
Bjerknes, G. et al. (1987) Computers and democracy : a Scandinavian challenge. Aldershot [Hants, England]; Brookfield [Vt.], USA: Avebury, p. 434. Available at: http://www.worldcat.org/title/computers-and-democracy-a-scandinavian-challenge/oclc/614994092?referer=di&ht=edition.
Björgvinsson, E., Ehn, P. and Hillgren, P.-A. (2010) ‘Participatory design and “democratizing innovation”’, in Proceedings of the 11th biennial participatory design conference, pp. 41–50.
Boud, D., Keogh, R. and Walker, D. (1985) Reflection: Turning experience into learning. Routledge.
Bowker, G. C. et al. (2015) Boundary objects and beyond : working with Leigh Star. MIT Press, p. 548. Available at: https://books.google.co.uk/books?hl=en&lr=&id=nmSkCwAAQBAJ&oi=fnd&pg=PR5&dq=Boundary+Objects+and+Beyond:+Working+with+Leigh+Star&ots=blmnW7yz4u&sig=F08uGeG_lT_klhhR64M18tQNI1s#v=onepage&q=Boundary Objects and Beyond%3A Working with Leigh Star&f=false.
Bowyer, A. (2011) ‘Why files need to die’. Available at: http://radar.oreilly.com/2011/07/why-files-need-to-die.html.
Bowyer, A. (2018) ‘Free Data Interfaces: Taking Human- Data Interaction to the Next Level’, CHI Workshops 2018. Available at: https://eprints.ncl.ac.uk/273825.
Bowyer, A. (2021) ‘Human-Data Interaction has two purposes: Personal Data Control and Life Information Exploration’. Available at: https://eprints.ncl.ac.uk/273832#.
Bowyer, A. et al. (2018) ‘Understanding the Family Perspective on the Storage, Sharing and Handling of Family Civic Data’, in Conference on human factors in computing systems - proceedings. New York, New York, USA: ACM Press, pp. 1–13. doi: 10.1145/3173574.3173710.
Bowyer, A. et al. (2019) ‘Human-data interaction in the context of care: Co-designing family civic data interfaces and practices’, in Conference on human factors in computing systems - proceedings. doi: 10.1145/3290607.3312998.
Brandt, E. and Messeter, J. (2004) ‘Facilitating collaboration through design games’, in Proceedings of the eighth conference on participatory design artful integration: Interweaving media, materials and practices - pdc 04. New York, New York, USA: ACM Press, p. 121. doi: 10.1145/1011870.1011885.
Braun, V. and Clarke, V. (2006) ‘Using thematic analysis in psychology’, Qualitative Research in Psychology. Taylor & Francis, 3(2), pp. 77–101. doi: 10.1191/1478088706qp063oa.
Bridle, J. (2016) ‘Algorithmic Citizenship, Digital Statelessness’, GeoHumanities. James Bridle, 2(2), pp. 377–381. doi: 10.1080/2373566x.2016.1237858.
Brooks, D. (2013) ‘The Philosophy of Data’. Available at: https://www.nytimes.com/2013/02/05/opinion/brooks-the-philosophy-of-data.html.
Brown, D. (2015) ‘Here’s what ‘fail fast’ really means’. Available at: https://venturebeat.com/2015/03/15/heres-what-fail-fast-really-means/.
Brynjolfsson, E. and Oh, J. H. (2012) ‘The attention economy: Measuring the value of free digital services on the internet’, International Conference on Information Systems, ICIS 2012, 4, pp. 3243–3261.
Bufalieri, L. et al. (2020) ‘GDPR: When the right to access personal data becomes a threat’. doi: 10.1109/icws49710.2020.00017.
Bunge, M. (1999) Social Science Under Debate: A Philosophical Perspective. University of Toronto Press. Available at: https://books.google.co.uk/books?id=-MLjZzJLbpkC.
Burkeman, O. (2011) ‘SXSW 2011: The internet is over’. Available at: https://www.theguardian.com/technology/2011/mar/15/sxsw-2011-internet-online (Accessed: 23 March 2021).
Bush, V. (1945) ‘As we may think’, The Atlantic Monthly, 3(2), pp. 35–46. doi: 10.1145/227181.227186.
Bødker, S. (2006) ‘When second wave HCI meets third wave challenges’, ACM International Conference Proceeding Series, 189(October), pp. 1–8. doi: 10.1145/1182475.1182476.
Bødker, S. (2015) ‘Third-wave HCI, 10 years later—participation and sharing’, Interactions, 22(5), pp. 24–31. doi: 10.1145/2804405.
Campbell, P. L. (2011) ‘Peirce, pragmatism, and the right way of thinking’, Sandia National Laboratories, Albuquerque. Citeseer.
Carter, J. (2015) ‘Who are the digital disruptors redefining entire industries?’ Available at: https://www.techradar.com/uk/news/world-of-tech/who-are-the-digital-disruptors-redefining-entire-industries-1298171 (Accessed: 23 March 2021).
Caruthers, M. (2018) ‘World Password Day: How to Improve Your Passwords’. Available at: https://blog.dashlane.com/world-password-day/ (Accessed: 5 May 2021).
Cavoukian, A. (2010) ‘Privacy by design: the definitive workshop. A foreword by Ann Cavoukian, Ph.D’, Identity in the Information Society, 3(2), pp. 247–251. doi: 10.1007/s12394-010-0062-y.
Cavoukian, A. (2012) ‘Privacy by Design and the Emerging Personal Data Ecosystem’, (October), pp. 1–39.
Chang, A. (2018) ‘The Facebook and Cambridge Analytica scandal, explained with a simple diagram - Vox’. Available at: https://www.vox.com/policy-and-politics/2018/3/23/17151916/facebook-cambridge-analytica-trump-diagram.
Cheetham, M. et al. (2018) ‘Embedded research: A promising way to create evidence-informed impact in public health?’, Journal of Public Health (United Kingdom). Oxford University Press, 40(suppl_1), pp. i64–i70. doi: 10.1093/pubmed/fdx125.
Chevalier, J. M. and Buckles, D. J. (2008) SAS2: A guide to collaborative inquiry and social engagement. SAGE Publishing India.
Chevalier, J. M. and Buckles, D. J. (2019) Participatory action research: Theory and methods for engaged inquiry. Routledge.
Choe, E. K. et al. (2014) ‘Understanding quantified-selfers’ practices in collecting and exploring personal data’, in Proceedings of the 32nd annual acm conference on human factors in computing systems - chi ’14. New York, New York, USA: ACM Press, pp. 1143–1152. doi: 10.1145/2556288.2557372.
Chung, C. F. et al. (2016) ‘Boundary negotiating artifacts in personal informatics: Patient-provider collaboration with patient-generated data’, Proceedings of the ACM Conference on Computer Supported Cooperative Work, CSCW, 27, pp. 770–786. doi: 10.1145/2818048.2819926.
Clarke, N. et al. (2019) ‘GDPR: an impediment to research?’, Irish Journal of Medical Science (1971-). Springer, 188(4), pp. 1129–1135.
Cogran, P. and Kinsley, S. (2012) ‘Paying Attention: towards a critique of the attention economy’, Culture Machine, 13.
Comandè, G. and Schneider, G. (2021) ‘Can the GDPR make data flow for research easier? Yes it can, by differentiating! A careful reading of the GDPR shows how EU data protection law leaves open some significant flexibilities for data protection-sound research activities’, Computer Law & Security Review. Elsevier, 41, p. 105539.
Connected Health Cities (2017) ‘SILVER Project: Smart Interventions for Local Residents’. Available at: https://www.connectedhealthcities.org/research-projects/troubled-families/ (Accessed: 14 May 2021).
Copeland, E. (2015) Small Pieces Loosely Joined: How smarter use of technology and data can deliver real reform of local government. Available at: https://policyexchange.org.uk/publication/small-pieces-loosely-joined-how-smarter-use-of-technology-and-data-can-deliver-real-reform-of-local-government/.
Cornford, J., Baines, S. and Wilson, R. (2013) ‘Representing the family: how does the state ’think family’?’, Policy & Politics, 41(1), pp. 1–19. doi: 10.1332/030557312X645838.
Corra, M. and Willer, D. (2002) ‘The gatekeeper’, Sociological Theory. SAGE Publications Sage CA: Los Angeles, CA, 20(2), pp. 180–207.
Coughlan, T. et al. (2013) ‘Methods for studying technology in the home’, in CHI’13 extended abstracts on human factors in computing systems, pp. 3207–3210.
Coughlan, T. et al. (2013) ‘Current issues and future directions in methods for studying technology in the home’, PsychNology Journal, 11(2), pp. 159–184.
Council of the European Union (2015) ‘Proposal for a Regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation)’. Brussels. Available at: http://data.consilium.europa.eu/doc/document/ST-9565-2015-INIT/en/pdf.
Crabtree, A. and Mortier, R. (2016) ‘Personal Data, Privacy and the Internet of Things: The Shifting Locus of Agency and Control’, SSRN Electronic Journal, pp. 1–20. doi: 10.2139/ssrn.2874312.
Crabtree, A. and Tolmie, P. (2018) ‘The practical politics of sharing personal data’, in Personal and Ubiquitous Computing. Springer-Verlag (2), pp. 293–315. doi: 10.1007/s00779-017-1071-8.
Crivellaro, C. et al. (2019) ‘Not-equal: Democratizing research in digital innovation for social justice’, Interactions, 26(2), pp. 70–73. doi: 10.1145/3301655.
Croll, A. (2009) ‘The Three Economies of Online Currency’. Available at: https://solveforinteresting.com/the-three-currencies-of-the-online-economy/.
Ctrl-Shift (2014) ‘Personal Information Management Services: An analysis of an emerging market’. Nesta, p. 38. Available at: https://www.nesta.org.uk/report/personal-information-management-services-an-analysis-of-an-emerging-market/.
‘Data’ (no date). Grammarist. Available at: https://grammarist.com/usage/data/.
Decker, S. and Frank, M. (2004) ‘The Networked Semantic Desktop’, WWW Workshop on Application Design, Development and Implementation Issues in the Semantic Web. doi: 10.1108/eb057368.
‘Delicious’ (2003). Available at: https://en.wikipedia.org/wiki/Delicious_(website).
Department for Education (2018) Working Together to Safeguard Children. March, p. 393. doi: 10.1080/13561820020003919.
Design Council UK (2004) ‘What is the framework for innovation? Design Council’s evolved Double Diamond’. Available at: https://www.designcouncil.org.uk/news-opinion/what-framework-innovation-design-councils-evolved-double-diamond (Accessed: 20 May 2021).
Dewey, J. (1938) ‘Experience and education’.
Dewey, J. and Archambault, R. D. (1964) ‘John Dewey on education: Selected writings’.
Dey, A. K. (2000) Providing Architectural Support for Building Context-Aware Applications. PhD thesis.
Dey, A. K. (2001) ‘Understanding and using context’, Personal and ubiquitous computing, pp. 4–7. Available at: http://dl.acm.org/citation.cfm?id=593572.
Dijck, J. van (2014) ‘Datafication, dataism and dataveillance: Big data between scientific paradigm and ideology’, Surveillance and Society. Surveillance Studies Network, 12(2), pp. 197–208. doi: 10.24908/ss.v12i2.4776.
DiSalvo, C. (2010) ‘Design, Democracy and Agonistic Pluralism’, Proceedings of the Design Research Society Conference 2010, pp. 366–371.
DiSalvo, C. (2012) Adversarial Design. MIT Press (Design thinking, design theory). doi: 10.7551/mitpress/8732.003.0007.
Dourish, P. (2001) Where the action is: the foundations of embodied interaction. MIT press.
Dourish, P. (2003) ‘The appropriation of interactive technologies: Some Lessons From Placeless Documents’, Computer Supported Cooperative Work, 12(4), pp. 465–490.
Dourish, P. (2004) ‘What we talk about when we talk about context’, Personal and Ubiquitous Computing, 8(1), pp. 19–30. doi: 10.1007/s00779-003-0253-8.
Dourish, P. et al. (2000) ‘Extending document management systems with user-specific active properties’, ACM Transactions on Information Systems, 18(2), pp. 140–170. doi: 10.1145/348751.348758.
Eliasson, J., Cerratto Pargman, T. and Ramberg, R. (2009) ‘Embodied interaction or context-aware computing? An integrated approach to design’, in Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics). Springer, Berlin, Heidelberg (PART 1), pp. 606–615. doi: 10.1007/978-3-642-02574-7_68.
Engelbart, D. C. (1962) ‘Augmenting human intellect: A conceptual framework’. Menlo Park, CA, USA: Stanford Research Institute.
Etzel, B. (1995) ‘New strategy and techniques to cope with information overload’, in IEE colloquium on information overload. IEE (223), pp. 2–2. doi: 10.1049/ic:19951427.
European Commission (2014) Research and Innovation in the field of ICT for Health, Wellbeing and Ageing Well: an overview, p. 39.
European Union Agency for Fundamental Rights (2020) ‘Your Rights Matter: Data Protection and Privacy 2020’, p. 20. doi: 10.2811/031862.
‘Facebook–Cambridge Analytica Data Scandal’ (2014). Available at: https://en.wikipedia.org/wiki/Facebook–Cambridge_Analytica_data_scandal.
‘Facebook - Data Policy’ (no date). Available at: https://www.facebook.com/about/privacy (Accessed: 9 August 2021).
Feng, Y. and Agosto, D. E. (2019) ‘Revisiting personal information management through information practices with activity tracking technology’, Journal of the Association for Information Science and Technology, 70(12), pp. 1352–1367. doi: 10.1002/asi.24253.
Field, F. (2010) The Foundation Years: preventing poor children becoming poor adults. Available at: http://www.inspiredbybabies.org.uk/Page2NationalrelevantDocsresources/Frank Field Preventing poor children becoming poor adults 2011.pdf.
‘Finland: Broadband Access Made Legal Right In Landmark Law’ (2010). Available at: https://www.huffpost.com/entry/finland-broadband-access_n_320481 (Accessed: 23 March 2021).
Firth, E. (2019) ‘Personal data has value in so many different ways’. digi.me. Available at: https://blog.digi.me/2019/09/04/personal-data-has-so-much-more-value-than-pure-cash/.
Foulonneau, M. and Riley, J. (2008) Metadata for digital resources : implementation, systems design and interoperability. Chandos Pub, p. 203.
Fowler, M. and Highsmith, J. (2001) ‘The agile manifesto’, Software Development, 9(8), pp. 28–35.
Freeman, E. and Gelernter, D. (1996) ‘Lifestreams: A Storage Model for Personal Data’, SIGMOD Record (ACM Special Interest Group on Management of Data). Association for Computing Machinery (ACM), 25(1), pp. 80–86. doi: 10.1145/381854.381893.
Friedman, B. and Hendry, D. G. (2019) Value Sensitive Design: Shaping Technology with Moral Imagination. MIT Press (The mit press). Available at: https://books.google.co.uk/books?id=8ZiWDwAAQBAJ.
Friedman, R. L. (2006) ‘Deweyan Pragmatism’, William James Studies, 1. Available at: https://williamjamesstudies.org/deweyan-pragmatism/.
Frost, A. (2019) ‘Forget Folders: The Best Ways to Organize Your Files with Tags and Labels’. Available at: https://zapier.com/blog/how-to-use-tags-and-labels/.
Fu, S. et al. (2020) ‘Social media overload, exhaustion, and use discontinuance: Examining the effects of information overload, system feature overload, and social overload’, Information Processing and Management, 57(6). doi: 10.1016/j.ipm.2020.102307.
Gelernter, D. (1994) ‘The cyber-road not taken: Lost on the info-highway? Here’s some stuff that could really change your life.’, The Washington Post, 3.
Gellman, B. (2013) ‘Edward Snowden, after months of NSA revelations, says his mission’s accomplished’, The Washington Post, 23. Available at: http://www.washingtonpost.com/world/national-security/edward-snowden-after-months-of-nsa-revelations-says-his-missions-accomplished/2013/12/23/49fc36de-6c1c-11e3-a523-fe73f0ff6b8d_story.html.
Gemmell, J., Bell, G. and Lueder, R. (2006) ‘MyLifeBits: A personal database for everything’, Communications of the ACM, 49(1), pp. 88–95. doi: 10.1145/1107458.1107460.
Gillespie, T. and Seaver, N. (2016) ‘Critical Algorithm Studies - A Reading List’. Available at: https://socialmediacollective.org/reading-lists/critical-algorithm-studies/.
Gitelman, L. (2013) Raw data is an oxymoron. Edited by Lisa Gitelman. MIT Press, p. 182. Available at: https://mitpress.mit.edu/books/raw-data-oxymoron.
Glavic, B. et al. (2021) ‘Trends in Explanations: Understanding and Debugging Data-driven Systems’, Foundations and Trends in Databases. Now Publishers, Inc., 11(3), pp. 226–318.
Golembewski, M. and Selby, M. (2010) ‘Ideation decks’, in Proceedings of the 8th acm conference on designing interactive systems - dis ’10. New York, New York, USA: ACM Press, p. 89. doi: 10.1145/1858171.1858189.
Gonscherowski, S. and Bieker, F. (2018) ‘Who You Gonna Call When There’s Something Wrong in Your Processing? Risk Assessment and Data Breach Notifications in Practice’, in IFIP international summer school on privacy and identity management. Springer, pp. 35–50.
‘Google Desktop Search’ (2004). Available at: https://en.wikipedia.org/wiki/Google_Desktop.
Guba, E. G. (1990) ‘The alternative paradigm dialog’, The paradigm dialog. Sage Publications, Inc, pp. 17–30. Available at: http://www.jstor.org/stable/3340973.
Gurstein, M. (2003) ‘Effective use: A community informatics strategy beyond the digital divide’, First Monday, 8(12). doi: 10.5210/fm.v0i0.1798.
Gurstein, M. B. (2011) ‘Open data: Empowering the empowered or effective data use for everyone?’, First Monday. First Monday, 16(2). doi: 10.5210/fm.v16i2.3316.
Hamon, R. et al. (2021) ‘Impossible Explanations? Beyond explainable AI in the GDPR from a COVID-19 use case scenario’, in Proceedings of the 2021 acm conference on fairness, accountability, and transparency, pp. 549–559.
Harbird, R. (2006) ‘Novel Applications for Information Technology in Risk Assessment for Children’s Social Care in the UK’, Rn. Available at: http://www.cs.ucl.ac.uk/research/researchnotes/documents/RN_06_11.pdf.
Harris, T. (2013a) ‘A Call to Minimize Distraction Respect Users’ Attention’. Available at: http://www.minimizedistraction.com/.
Harris, T. (2013b) ‘Who We Are: Center for Humane Technology (CHT)’. Available at: https://www.humanetech.com/who-we-are.
Harris, T. (2016) ‘How Technology Hijacks People’s Minds — from a Magician and Google’s Design Ethicist’. Available at: https://www.tristanharris.com/2016/05/how-technology-hijacks-peoples-minds-from-a-magician-and-googles-design-ethicist/ (Accessed: 22 March 2019).
Hart-Davidson, W., Zachry, M. and Spinuzzi, C. (2012) ‘Activity streams: Building context to coordinate writing activity in collaborative teams’, in SIGDOC’12 - proceedings of the 30th acm international conference on design of communication. New York, New York, USA: ACM Press, pp. 279–287. doi: 10.1145/2379057.2379109.
Hayes, G. R. (2011) ‘The relationship of action research to human-computer interaction’, ACM Transactions on Computer-Human Interaction, 18(3), pp. 1–20. doi: 10.1145/1993060.1993065.
‘HDI Lab, Heerlen’ (2020). Available at: https://hdilab.com/.
‘HDI Network Plus, University of Glasgow’ (2018). Available at: https://hdi-network.org/.
Hemp, P. (2009) ‘Death by Information Overload’. Available at: https://hbr.org/2009/09/death-by-information-overload (Accessed: 23 March 2021).
Henderson, I. and Group, B.-s. W. (2020) ‘Customer — Supplier Engagement Framework Explained’, pp. 1–7. Available at: https://me2ba.org/wp-content/uploads/2020/09/customer-supplier-engagement-framework-updated-9-28.pdf.
Hendler, J. and Berners-Lee, T. (2010) ‘From the Semantic Web to social machines: A research challenge for AI on the World Wide Web’. doi: 10.1016/j.artint.2009.11.010.
Herselman, M. et al. (2016) ‘A Digital Health Innovation Ecosystem for South Africa’, in 2016 ist-africa conference, ist-africa 2016. doi: 10.1109/ISTAFRICA.2016.7530615.
Hixon, J. G. and Swann, W. B. (1993) ‘When Does Introspection Bear Fruit? Self-Reflection, Self-Insight, and Interpersonal Choices’, Journal of Personality and Social Psychology, 64(1), pp. 35–43. doi: 10.1037/0022-3514.64.1.35.
Hoffman, W. (2010) ‘Rethinking Personal Data’. Available at: https://web.archive.org/web/20110220013300/http://www.weforum.org/issues/rethinking-personal-data.
Hoffman, W. (2011) Personal data : The emergence of a new asset class. World Economic Forum, pp. 1–40. Available at: http://www.weforum.org/reports/personal-data-emergence-new-asset-class.
Hoffman, W. (2013) Unlocking the Value of Personal Data: From Collection to Usage Prepared in collaboration with The Boston Consulting Group Industry Agenda. February. World Economic Forum.
Hoffman, W. (2014a) Rethinking Personal Data : A New Lens for Strengthening Trust. May. World Economic Forum, p. 35. Available at: http://www3.weforum.org/docs/WEF_RethinkingPersonalData_ANewLens_Report_2014.pdf.
Hoffman, W. (2014b) Rethinking personal data: Trust and context in user-centred data ecosystems. May. World Economic Forum, p. 35. Available at: http://www3.weforum.org/docs/WEF_RethinkingPersonalData_TrustandContext_Report_2014.pdf.
Honeyman, M., Dunn, P. and Mckenna, H. (2016) A digital NHS?
Hoofnagle, C. J., Sloot, B. van der and Borgesius, F. Z. (2019) ‘The European Union general data protection regulation: What it is and what it means’, Information and Communications Technology Law. Taylor & Francis, 28(1), pp. 65–98. doi: 10.1080/13600834.2019.1573501.
Hosch, W. L. (2017) ‘Web 2.0’. Available at: https://www.britannica.com/topic/Web-20 (Accessed: 26 April 2021).
Hotho, A., Nürnberger, A. and Paaß, G. (2005) ‘A brief survey of text mining.’, in Ldv forum. Citeseer (1), pp. 19–62.
Huberman, M. and Miles, M. B. (2002) The qualitative researcher’s companion. Sage.
Human, S. and Cech, F. (2021) ‘A human-centric perspective on digital consenting: The case of GAFAM’, Smart Innovation, Systems and Technologies, 189, pp. 139–159. doi: 10.1007/978-981-15-5784-2_12.
‘Human Data Interaction Project at the Data to AI Lab, MIT’ (2015). Available at: https://hdi-dai.lids.mit.edu/.
Hutton, D. M. (2012) ‘Turing’s Cathedral: The Origins of the Digital Universe’. Emerald Group Publishing Limited.
Hwang, E. (2021) ‘Sketching Dialogue: Incorporating Sketching in Empathic Semi-structured Interviews for HCI’.
‘Information’ (no date). Available at: https://en.wikipedia.org/wiki/Information.
Information Commissioner’s Office (2014) ‘Data controllers and data processors: what the difference is and what the governance implications are’, p. 20. Available at: https://ico.org.uk/for-organisations/guide-to-data-protection/introduction-to-data-protection/some-basic-concepts/.
Information Commissioner’s Office (2018) ‘Your data matters - Your rights’. Available at: https://ico.org.uk/your-data-matters/.
Information Commissioner’s Office (2021a) ‘Your right of access’. Available at: https://ico.org.uk/your-data-matters/your-right-to-get-copies-of-your-data/ (Accessed: 23 August 2021).
Information Commissioner’s Office (2021b) ‘Your right to data portability’.
‘Infovark Company Profile’ (2007). Available at: https://www.crunchbase.com/organization/infovark.
Jelly, M. (2021) ‘The Mission’. ethi.me. Available at: https://www.ethi.me/the-mission (Accessed: 31 March 2021).
Jenkins, H. (2006) Convergence Culture: Where Old and New Media Collide. New York, USA: New York University Press. doi: 10.7551/mitpress/9780262036016.003.0012.
Jilek, C. et al. (2018) ‘Context spaces as the cornerstone of a near-transparent and self-reorganizing semantic desktop’, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11155 LNCS, pp. 89–94. doi: 10.1007/978-3-319-98192-5_17.
Johnson, S. L., Kim, Y. M. and Church, K. (2010) ‘Towards client-centered counseling: Development and testing of the WHO Decision-Making Tool’, Patient Education and Counseling. Elsevier Ireland Ltd, 81(3), pp. 355–361. doi: 10.1016/j.pec.2010.10.011.
Jones, T. (2011) ‘Designing for second screens : The Autumnwatch Companion’. Available at: https://www.bbc.co.uk/blogs/researchanddevelopment/2011/04/the-autumnwatch-companion---de.shtml.
Jones, W. (2011a) ‘The Future of Personal Information Management Part I: Our Information, Always and Forever’.
Jones, W. (2011b) ‘The Future of Personal Information Management Part I: Our Information, Always and Forever’, p. 72.
Kalvet, T. (2005) ‘Digital divide and the ICT paradigm generally and in estonia’, in Encyclopedia of developing regional communities with information and communication technology. IGI Global, pp. 182–187. doi: 10.4018/978-1-59140-575-7.ch032.
Karger, D. R. et al. (2005) ‘Haystack: A customizable general-purpose information management tool for end users of semistructured data’, in 2nd biennial conference on innovative data systems research, cidr 2005, pp. 13–27. Available at: https://s3.amazonaws.com/academia.edu.documents/46870765/haystack.pdf.
Karger, D. R. and Jones, W. (2006) ‘Data unification in personal information management’, Communications of the ACM, 49(1), p. 77. doi: 10.1145/1107458.1107496.
Kasirzadeh, A. and Clifford, D. (2021) Fairness and Data Protection Impact Assessments. Association for Computing Machinery (1), pp. 146–153. doi: 10.1145/3461702.3462528.
Kaye, J. et al. (2015) ‘Dynamic consent: a patient interface for twenty-first century research networks’, European Journal of Human Genetics. Nature Publishing Group, 23(2), pp. 141–146. doi: 10.1038/ejhg.2014.71.
Kelly, K. and Wolf, G. (2007) ‘What is the quantified self’. Available at: https://web.archive.org/web/20100507215130/http://www.kk.org/quantifiedself/2007/10/what-is-the-quantifiable-self.php.
Kelly, R. (2020) ‘The Biggest ICO Fines Ever Issued’. Available at: https://digit.fyi/data-protection-2020-the-biggest-fines-ever-issued-by-the-ico/.
Kensing, F. and Blomberg, J. (1998) ‘Participatory design: Issues and concerns’, Computer supported cooperative work (CSCW). Springer, 7(3), pp. 167–185.
Klatzky, S. R. (1970) ‘Automation, size, and the locus of decision making: the cascade effect’, The Journal of Business. JSTOR, 43(2), pp. 141–151. Available at: https://www.jstor.org/stable/pdf/2352107.pdf?refreqid=excelsior%3A24bde6bf7de0eccf42c6ea11f8446d38.
Klein, B. et al. (2004) ‘Enabling flow - A paradigm for document-centered personal information spaces’, in Proceedings of the eighth iasted international conference on artificial intelligence and soft computing, pp. 187–192. Available at: https://www.semanticscholar.org/paper/Enabling-flow%3A-%7BA%7D-paradigm-for-document-centered-Klein-Agne/22be4a7b25e75de235e5d96bad6ab4ab4583daac.
Kostkova, P. (2015) ‘Grand Challenges in Digital Health’, Frontiers in Public Health. Frontiers Media SA, 3. doi: 10.3389/fpubh.2015.00134.
Kriisk, K. and Minas, R. (2017) ‘Social rights and spatial access to local social services: The role of structural conditions in access to local social services in Estonia’, Social Work and Society, 15(1). Available at: https://www.socwork.net/sws/article/view/503/1007.
Krishnan, A. (2010) ‘Pervasive Personal Information Spaces’. University of Waikato. Available at: https://researchcommons.waikato.ac.nz/handle/10289/4590.
Krishnan, A. and Jones, S. (2005) ‘TimeSpace: Activity-based temporal visualisation of personal information spaces’, Personal and Ubiquitous Computing, 9(1), pp. 46–65. doi: 10.1007/s00779-004-0291-x.
Lansdale, M. and Edmonds, E. (1992) ‘Using memory for events in the design of personal filing systems’, International Journal of Man-Machine Studies, 36(1), pp. 97–126. doi: 10.1016/0020-7373(92)90054-O.
Lansdale, M. W. (1988) ‘The psychology of personal information management’, Applied Ergonomics, 19(March 1988), pp. 55–66. doi: 10.1016/0003-6870(88)90199-8.
Larsson, S. (2018) ‘Algorithmic governance and the need for consumer empowerment in data-driven markets’, Internet Policy Review, 7(2). doi: 10.14763/2018.2.791.
Lecluijze, I. et al. (2015) ‘Co-production of ICT and children at risk: The introduction of the Child Index in Dutch child welfare’, Children and Youth Services Review. Elsevier Ltd, 56, pp. 161–168. doi: 10.1016/j.childyouth.2015.07.003.
Leprince-Ringuet, D. (2021) ‘GDPR fines increased by 40% last year, and they’re about to get a lot bigger’. Available at: https://www.zdnet.com/article/gdpr-fines-increased-by-40-last-year-and-theyre-about-to-get-a-lot-bigger/.
Levine, R. (2011) ‘How the internet has all but destroyed the market for films, music and newspapers’. Available at: https://www.theguardian.com/media/2011/aug/14/robert-levine-digital-free-ride (Accessed: 23 March 2021).
Lewin, K. (1946) ‘Action Research and Minority Problems’, Journal of Social Issues, 2(4), pp. 34–46. doi: 10.1111/j.1540-4560.1946.tb02295.x.
Lewin, K. (1951) ‘Problems of research in social psychology’, Field theory in social science: Selected theoretical papers, pp. 155–169.
Li, I. (2009) ‘Designing Personal Informatics Applications and Tools that Facilitate Monitoring of Behaviors’, UIST. Available at: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.232.8536.
Li, I., Dey, A. and Forlizzi, J. (2010) ‘A stage-based model of personal informatics systems’, Proceedings of the 28th international conference on Human factors in computing systems CHI 10. New York, New York, USA: ACM Press, p. 557. doi: 10.1145/1753326.1753409.
Lindley, S. E. et al. (2018) ‘Exploring new metaphors for a networked world through the file biography’, Conference on Human Factors in Computing Systems - Proceedings, 2018-April, pp. 1–12. doi: 10.1145/3173574.3173692.
‘List of target companies for GDPR requests’ (no date). Available at: https://wiki.personaldata.io/wiki/Item:Q2369 (Accessed: 22 September 2021).
Lowe, T. and Wilson, R. (2015) ‘Playing the game of outcomes-based performance management: Is gamesmanship inevitable?’
Luger, E. and Rodden, T. (2013) ‘An informed view on consent for ubicomp’, in UbiComp 2013 - proceedings of the 2013 acm international joint conference on pervasive and ubiquitous computing. New York, New York, USA: ACM Press, pp. 529–538. doi: 10.1145/2493432.2493446.
Malomo, F. and Sena, V. (2017) ‘Data Intelligence for Local Government? Assessing the Benefits and Barriers to Use of Big Data in the Public Sector’, Policy and Internet, 9(1), pp. 7–27. doi: 10.1002/poi3.141.
Malone, T. W. (1983) ‘How do people organize their desks?: Implications for the design of office information systems’, ACM Transactions on Information Systems, 1(1), pp. 99–112. doi: 10.1145/357423.357430.
Mannay, D. and Morgan, M. (2015) ‘Doing ethnography or applying a qualitative technique? Reflections from the “waiting field”’, Qualitative Research. Sage Publications Sage UK: London, England, 15(2), pp. 166–182.
Marshall, C. C. and Jones, W. (2006) ‘Keeping encountered information’, Communications of the ACM, 49(1), pp. 66–67. doi: 10.1145/1107458.1107493.
McCarthy, J. and Wright, P. (2004) ‘Technology as experience’, Interactions, 11(5), pp. 42–43. doi: 10.1145/1015530.1015549.
Miettinen, R. (2013) Innovation, human capabilities, and democracy: Towards an enabling welfare state. Oxford University Press.
Millar, S. (2002) ‘UK singled out for criticism over protection of privacy’. Available at: https://www.theguardian.com/technology/2002/sep/05/security.humanrights.
Moraveji, N. et al. (2007) ‘Comicboarding: Using comics as proxies for participatory design with children’, in Conference on human factors in computing systems - proceedings. ACM, pp. 1371–1374. doi: 10.1145/1240624.1240832.
Morozov, E. (2013) To save everything, click here: The folly of technological solutionism. Public Affairs.
Mortier, R. et al. (2013) ‘Challenges & opportunities in human-data interaction’, University of Cambridge, Computer Laboratory. Citeseer. doi: 10.5210/fm.v17i5.4013.
Mortier, R. et al. (2014) ‘Human-data interaction: The human face of the data-driven society’, Available at SSRN 2508051. doi: 10.2139/ssrn.2508051.
Murton, D. (2011) ‘A Brief History of the Evolution of Social Technology’. Available at: https://www.scottmonty.com/2011/04/brief-history-of-evolution-of-social.html.
MyData (2017) ‘Declaration - MyData.org’. Available at: https://mydata.org/declaration/ (Accessed: 8 November 2019).
‘MyData Comparison of Principles document’ (2017). Available at: http://bit.ly/pd-principles.
MyData.org (2018) ‘MyData - Who we are’. Available at: https://mydata.org/about/.
Mydex CIC (2010) ‘The Case for Personal Information Empowerment: The rise of the personal data store’, pp. 1–44.
‘myTimeline’ (2018). Available at: https://www.timelineinc.com/ (Accessed: 23 March 2021).
Nadeem, D. and Sauermann, L. (2007) ‘From Philosophy and Mental-Models to Semantic Desktop Research: Theoretical Overview’.
Neef, D. (2015) Digital exhaust: what everyone should know about big data, digitization and digitally driven innovation. Pearson Education.
Neff, G. (2013) ‘Why Big Data Won’t Cure Us’, Big Data, 1(3), pp. 117–123. doi: 10.1089/big.2013.0029.
Negroponte, N. and Bolt, R. A. (1978) Spatial data management system. Massachusetts Institute of Technology, Architecture Machine Group.
Nelson, T. (2006) ‘Lost in hyperspace’, New Scientist, 191(2561).
Nelson, T. H. (1965) ‘Complex information processing’, pp. 84–100. doi: 10.1145/800197.806036.
Norman, D. A. and Draper, S. W. (1986) ‘User Centered System Design; New Perspectives on Human-Computer Interaction’. L. Erlbaum Associates Inc.
Odom, W. et al. (2018) ‘Time, Temporality, and Slowness’, pp. 383–386. doi: 10.1145/3197391.3197392.
O’Donnell, B. (2020) ‘Zoom, the office and the future: What will work look like after coronavirus?’ Available at: https://eu.usatoday.com/story/tech/columnist/2020/09/07/zoom-work-from-home-future-office-after-coronavirus/5680284002/.
O’Donoghue, T. and Rabin, M. (2001) ‘Choice and procrastination’, The Quarterly Journal of Economics. MIT Press, 116(1), pp. 121–160.
OFSTED (2015) Early help: whose responsibility?, p. 32. Available at: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/410378/Early_help_whose_responsibility.pdf.
Organisation for Economic Co-operation and Development (1980) OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data. Available at: https://www.oecd.org/digital/ieconomy/oecdguidelinesontheprotectionofprivacyandtransborderflowsofpersonaldata.htm.
‘Our Values’ (no date). Available at: https://www.citizenme.com/about/our-values (Accessed: 31 March 2021).
Papert, S. (1980) ‘Mindstorms: children, computers, and powerful ideas’. Basic Books, Inc.
Peikoff, L. (1993) Objectivism: The Philosophy of Ayn Rand. Penguin Publishing Group (Ayn rand library). Available at: https://books.google.co.uk/books?id=G6DDlqNftGcC.
Perez, S. (2018) ‘Facebook is shutting down Friend List Feeds’. Available at: https://techcrunch.com/2018/08/09/facebook-is-shutting-down-friend-list-feeds-today/.
Pink, S. et al. (2013) ‘Applying the lens of sensory ethnography to sustainable hci’, ACM Transactions on Computer-Human Interaction (TOCHI). ACM New York, NY, USA, 20(4), pp. 1–18.
Pollock, R. (2011) ‘Building the (Open) Data Ecosystem – Open Knowledge Foundation Blog’. Available at: https://blog.okfn.org/2011/03/31/building-the-open-data-ecosystem/ (Accessed: 23 July 2019).
Pór, G. (1997) ‘Designing Knowledge Ecosystems for Communities of Practice’, in Advancing organizational capability via knowledge management.
Price Ball, M. (no date) ‘About Us’. Available at: https://www.openhumans.org/about/ (Accessed: 31 March 2021).
‘Privacy’ (no date). Available at: https://privacy.linkedin.com/ (Accessed: 9 August 2021).
‘Privacy - Apple (UK)’ (no date). Available at: https://www.apple.com/uk/privacy/ (Accessed: 9 August 2021).
‘Privacy & Terms – Google’ (no date). Available at: https://policies.google.com/ (Accessed: 9 August 2021).
Puussaar, A., Clear, A. K. and Wright, P. (2017) ‘Enhancing Personal Informatics Through Social Sensemaking’, Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems - CHI ’17. Association for Computing Machinery, 2017-May, pp. 6936–6942. doi: 10.1145/3025453.3025804.
Quinn, P. (2021) ‘Research under the GDPR–a level playing field for public and private sector research?’, Life Sciences, Society and Policy. Springer, 17(1), pp. 1–33.
Raskin, J. (2000) The humane interface: new directions for designing interactive systems. Addison-Wesley Professional.
Reason, P. and Bradbury, H. (2001) Handbook of action research: Participative inquiry and practice. Sage.
Ries, E. (2011) The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses. Crown.
Rivera-Pelayo, V. et al. (2012) ‘A framework for applying Quantified Self approaches to support reflective learning’, Proceedings of the IADIS International Conference Mobile Learning 2012, ML 2012, pp. 123–131.
Roche, M. (2011) ‘Full internet ban for sex offenders ruled unlawful’. Available at: https://ukhumanrightsblog.com/2011/08/12/full-internet-ban-for-sex-offenders-ruled-unlawful/ (Accessed: 23 March 2021).
Rogers, Y. (2006) ‘Moving on from Weiser’s Vision of Calm Computing: Engaging UbiComp Experiences’, LNCS, 4206, pp. 404–421. Available at: http://www.inf.ufg.br/~vagner/courses/mobilecomputing/docs/papers/03-Rogers_Ubicomp06.pdf.
Ross, G. (2005) ‘An introduction to Tim Berners-Lee’s Semantic Web’. Available at: https://www.techrepublic.com/article/an-introduction-to-tim-berners-lees-semantic-web/.
Saha, D. and Mukherjee, A. (2003) ‘Pervasive computing: A paradigm for the 21st century’. IEEE. doi: 10.1109/MC.2003.1185214.
Sauermann, L., Bernardi, A. and Dengel, A. (2005) ‘Overview and outlook on the semantic desktop’, in CEUR workshop proceedings.
Savage, A. and Hyde, R. (2014) ‘Using freedom of information requests to facilitate research’, International Journal of Social Research Methodology. Routledge, 17(3), pp. 303–317. doi: 10.1080/13645579.2012.742280.
Schumacher, K., Sintek, M. and Sauermann, L. (2008) ‘Combining fact and document retrieval with spreading activation for semantic desktop search’, in Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), pp. 569–583. doi: 10.1007/978-3-540-68234-9_42.
Searls, D. (2008) ‘The Intention Economy: What Happens When Customers Get Real Power’. Available at: https://web.archive.org/web/20101226073246/http://cyber.law.harvard.edu/sites/cyber.law.harvard.edu/files/2009_03_24_lunchtalk.ppt.
Searls, D. (2012) The intention economy: when customers take charge. Harvard Business Press.
Seligman, C. and Darley, J. M. (1976) ‘Feedback as a means of decreasing residential energy consumption’, Journal of Applied Psychology, 62(4), pp. 363–368. doi: 10.1037/0021-9010.62.4.363.
Shannon, C. E. (1948) ‘A mathematical theory of communication’, The Bell system technical journal. Nokia Bell Labs, 27(3), pp. 379–423.
Shilton, K. (2011) ‘Participatory Personal Data: An Emerging Research Challenge for the Information Sciences’, Advances in Information Science.
Shipman, F. M. and Marshall, C. C. (1999) ‘Formality Considered Harmful: Experiences, Emerging Themes, and Directions on the Use of Formal Representations in Interactive Systems’, pp. 333–352.
Shneiderman, B. (1996) ‘The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations’, in Proceedings of the 1996 IEEE symposium on visual languages, pp. 336–343.
Siegel, D. (2009) Pull: The power of the semantic web to transform your business. Penguin.
Siegel, D. (2010) ‘Personal Data Locker Vision Video’. Available at: https://vimeo.com/14061238.
Siegler, M. G. (2011) ‘Facebook Unveils Timeline: The Story Of Your Life On A Single Page’. Available at: https://techcrunch.com/2011/09/22/facebook-timeline/ (Accessed: 21 March 2021).
Simon, H. A. (1971) ‘Designing Organizations for an Information-Rich World’, Computers, Communication, and the Public Interest., pp. 37–72.
Simon, H. A. and Newell, A. (1958) ‘Heuristic Problem Solving: The next advance in operations research’. doi: 10.1057/978-1-349-94848-2_792-1.
Smith, N. K. (2011) Immanuel Kant’s critique of pure reason. Read Books Ltd.
Smith, R. C., Bossen, C. and Kanstrup, A. M. (2017) ‘Participatory design in an era of participation’, CoDesign. Taylor & Francis, 13(2), pp. 65–69. doi: 10.1080/15710882.2017.1310466.
Soja, E. (2015) ‘Supporting Healthcare of the Elderly through ICT: Socio-demographic Conditions and Digital Inclusion’, in Knowledge economy society - challenges and development trends of modern economy, finance and information technology., pp. 279–290.
Spagnuelo, D., Ferreira, A. and Lenzini, G. (2019) ‘Accomplishing Transparency within the General Data Protection Regulation.’, in ICISSP, pp. 114–125.
Spector, P. E. (1982) ‘Behavior in organizations as a function of employee’s locus of control’, Psychological Bulletin, 91(3), pp. 482–497. doi: 10.1037/0033-2909.91.3.482.
Spencer, D. and Warfel, T. (2004) ‘Card sorting: A definitive guide’, Boxes and arrows, 2(2004), pp. 1–23.
Spiekermann, S. and Korunovska, J. (2017) ‘Towards a value theory for personal data’, Journal of Information Technology, 32(1), pp. 62–84. doi: 10.1057/jit.2016.4.
Spinuzzi, C. (2005) ‘The methodology of participatory design’, Technical Communication. Society for Technical Communication, 52(2), pp. 163–174.
Star, S. L. (1989) ‘The Structure of Ill-Structured Solutions: Boundary Objects and Heterogeneous Distributed Problem Solving’, in Distributed artificial intelligence. Elsevier, pp. 37–54. doi: 10.1016/b978-1-55860-092-8.50006-x.
Star, S. L. (2010) ‘This is not a boundary object: Reflections on the origin of a concept’, Science Technology and Human Values, 35(5), pp. 601–617. doi: 10.1177/0162243910377624.
Steinberg, S. G. (1997) ‘Lifestreams’, Wired. Available at: https://www.wired.com/1997/02/lifestreams/.
Steyaert, J. and Gould, N. (2009) ‘Social work and the changing face of the digital divide’, British Journal of Social Work, 39(4), pp. 740–753. doi: 10.1093/bjsw/bcp022.
Symons, T. et al. (2017) ‘Me, my data and I: The future of the personal data economy’, DECODE (DEcentralised Citizen Owned Data Ecosystems) Report, (732546), p. 88. Available at: https://media.nesta.org.uk/documents/decode-02.pdf.
Taylor, L. (2017) ‘What is data justice? The case for connecting digital rights and freedoms globally’, Big Data and Society, 4(2). doi: 10.1177/2053951717736335.
Teevan, J. et al. (2004) ‘The perfect search engine is not enough: A study of orienteering behavior in directed search’, in Conference on human factors in computing systems - proceedings, pp. 415–422. Available at: http://people.csail.mit.edu/teevan/work/publications/papers/chi04.pdf.
Teevan, J. B. (2001) ‘Displaying dynamic information’, in Conference on human factors in computing systems - proceedings, pp. 417–418. doi: 10.1145/634067.634311.
Terdiman, D. (2008) ‘Using tags to improve the Flickr experience’. Available at: https://www.cnet.com/news/using-tags-to-improve-the-flickr-experience/.
The European Parliament and the Council of the European Union (2016a) ‘Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data’, pp. 16–32. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32016R0679.
The European Parliament and the Council of the European Union (2016b) ‘Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data’. Available at: https://eur-lex.europa.eu/eli/reg/2016/679/oj.
‘The GDPR: Does it Benefit Consumers in Any Practical Way?’ (2020). Atebits.com. Available at: https://www.atebits.com/the-gdpr-does-it-benefit-consumers-in-any-practical-way/.
‘The personal computer revolution’ (no date) in Britannica. Available at: https://www.britannica.com/technology/computer/The-personal-computer-revolution.
Timely (2020) ‘The attention economy: what it is, what it’s doing to you’. Available at: https://memory.ai/timely-blog/the-attention-economy.
Toonders, J. (2014) ‘Data Is the New Oil of the Digital Economy’. Available at: https://www.wired.com/insights/2014/07/data-new-oil-digital-economy/.
Tregeagle, S. and Darcy, M. (2008) ‘Child welfare and information and communication technology: Today’s challenge’, British Journal of Social Work, 38(8), pp. 1481–1498. doi: 10.1093/bjsw/bcm048.
Tufekci, Z. (2017) ‘We’re building a dystopia just to make people click on ads’. TED. Available at: https://www.ted.com/talks/zeynep_tufekci_we_re_building_a_dystopia_just_to_make_people_click_on_ads.
Tunikova, O. (2018) ‘Are We Consuming Too Much Information?’ Available at: https://medium.com/@tunikova_k/are-we-consuming-too-much-information-b68f62500089 (Accessed: 23 March 2021).
US Department of Health Education and Welfare (1973) ‘Records Computers and the Rights of Citizens’.
Various Authors (2018) ‘Our Digital Lives’, in TED talks. TED. Available at: https://www.ted.com/playlists/26/our_digital_lives.
Vlachokyriakos, V. et al. (2016) ‘Digital civics: Citizen empowerment with and through technology’, Conference on Human Factors in Computing Systems - Proceedings, pp. 1096–1099. doi: 10.1145/2851581.2886436.
Wagner, A. (2012) ‘Is internet access a human right?’ Available at: https://www.theguardian.com/law/2012/jan/11/is-internet-access-a-human-right (Accessed: 23 March 2021).
Waldman, A. E. (2020) ‘Data Protection by Design? A Critique of Article 25 of the GDPR’, 1239(2019), pp. 147–168.
Wallace, D. P. (2007) Knowledge management: Historical and cross-disciplinary themes. Libraries unlimited.
Weiser, M. (1991) ‘The computer for the 21st century’, Scientific American, 265(3), pp. 94–105. doi: 10.1145/329124.329126.
Weiser, M. and Brown, J. S. (1996) ‘The coming age of calm technology’, Beyond Calculation: The Next Fifty Years of Computing. Available at: http://www.teco.edu/lehre/ubiq/ubiq2000-1/calmtechnology.htm.
Wellisch, H. H. (1996) Abstracting, indexing, classification, thesaurus construction: A glossary. American Society of Indexers.
Whittaker, S. and Hirschberg, J. (2001) ‘The Character, Value, and Management of Personal Paper Archives’, ACM Transactions on Computer-Human Interaction, 8(2), pp. 150–170. doi: 10.1145/376929.376932.
‘Whose data is it anyway?’ (2019). 04: UBDI. Available at: https://www.ubdi.com/blog/whose-data-is-it-anyway (Accessed: 31 March 2021).
Wiki.personaldata.io (no date) ‘Subject Access Request Template’. Available at: https://wiki.personaldata.io/wiki/Template:Access (Accessed: 21 September 2021).
Williams, H. et al. (2015) ‘Dynamic consent: a possible solution to improve patient confidence and trust in how electronic patient records are used in medical research.’, JMIR medical informatics. JMIR Publications Inc., 3(1), p. e3. doi: 10.2196/medinform.3525.
Wilson, L., Wilson, R. and Martin, M. (2020) Health and Care Practitioner Insights: Understanding Information Sharing in Constellations of Care - Report on Amy’s Page workshop series. Great North Care Record. Available at: https://www.greatnorthcarerecord.org.uk.
Wilson, R. et al. (2011) ‘Re-Mixing Digital Economies in the Voluntary Community Sector? Governing Identity Information and Information Sharing in the Mixed Economy of Care for Children and Young People*’, Social Policy and Society. Cambridge University Press, 10(3), pp. 379–391. doi: 10.1017/s1474746411000108.
‘WinFS’ (no date). Available at: https://en.wikipedia.org/wiki/WinFS.
Wong, J. and Henderson, T. (2018) ‘How Portable is Portable? Exercising the GDPR’s Right to Data Portability’, ACM, pp. 911–920.
Woolgar, S. (2014) ‘Configuring the User: The Case of Usability Trials’, The Sociological Review, 38, pp. 58–99. doi: 10.1111/j.1467-954x.1990.tb03349.x.
Wright, P. and McCarthy, J. (2008) ‘Empathy and experience in HCI’, Conference on Human Factors in Computing Systems - Proceedings, pp. 637–646. doi: 10.1145/1357054.1357156.
Xie, A., Ho, J. C. F. and Wang, S. J. (2021) ‘Data City: Leveraging Data Embodiment Towards Building the Sense of Data Ownership’, pp. 365–378. doi: 10.1007/978-3-030-73426-8_22.
Zichichi, M., Ferretti, S. and D’Angelo, G. (2020) ‘On the Efficiency of Decentralized File Storage for Personal Information Management Systems’. Available at: http://arxiv.org/abs/2007.03505.
Zins, C. (2015) ‘What is the meaning of “data”, “information”, and “knowledge”?’, Institute of Knowledge Sharing, 3(1).
Ziogas, G. (2020) ‘The Inventor of the World Wide Web Says the Internet Is Broken’. Available at: https://medium.com/digital-diplomacy/the-inventor-of-the-world-wide-web-says-the-internet-is-broken-fbce1c8bf6cf.
Zuboff, S. (2019) The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. Profile. Available at: https://books.google.co.uk/books?id=W7ZEDgAAQBAJ.
Zuckerman, E. (2021) Mistrust: Why Losing Faith In Institutions Provides The Tools To Transform Them. New York, NY, USA: W. W. Norton & Company, pp. 1–3. doi: 10.1017/ipo.2021.30.
Zúñiga, H. de, Garcia-Perdomo, V. and McGregor, S. C. (2015) ‘What is second screening? Exploring motivations of second screen use and its effect on online political participation’, Journal of communication. Oxford University Press, 65(5), pp. 793–815.
The usage of the abbreviation PIMS here is not to be confused with its use to refer to “Personal Information Management Systems” in traditional PIM terminology.↩︎
Note that Case Study Three (Designing Personal Data Interfaces) involved no participants, which is why it does not have its own table in this section.↩︎
One participant withdrew from the study after the first interview of the Guided GDPR study due to COVID-19. The other 10 participants took part in all three stages.↩︎
(With one exception: the staff workshops within Case Study Two. Because the participants attended the workshops through their employers (the local authorities), we were not allowed to provide vouchers for participation.)↩︎
The term ‘Troubled Families’, popularised by the TFP, has fallen out of use, as it was considered negative and judgemental. A later term, ‘vulnerable families’, has also been criticised for being disempowering. Most councils now refer simply to ‘families’ or sometimes ‘supported families’, and the rest of this thesis adopts this convention.↩︎
Some leisure categories (namely Shopping and Transport) were included that are not strictly civic data, as these are useful for exploring issues of ethics and helping participants to have a reference point when discussing the “big data” benefits of data linking.↩︎
The first of these interviews was a ‘trial run’ with a couple recruited by convenience sampling, and was conducted, at the participants’ request, in a University meeting room rather than their home.↩︎
The notation used for the quote references is as follows:
The number after FQ/CQ/SQ provides a unique identifier for each quote, which can be used to look up the referenced quote in [INSERT REF TO APPENDIX SECTION HERE]. Individual speakers are identified only by their role: within each quote, or in brackets afterwards, they appear as Worker, Parent, Child, or Researcher.↩︎
As judged at the time of the workshops, summer 2018.↩︎
11 participants started the study but one dropped out after the first interview due to COVID-19, so only 10 participants conducted GDPR requests. 31 interviews were conducted in all.↩︎
In this study and throughout this thesis, my usage of the word ‘want’ in the context of data capabilities deliberately includes both meanings of the word: the need or desire of the individual, but also that which they lack.↩︎
At the time of writing (autumn 2021) the GDPR legally applies in both the European Union and the United Kingdom, which have a total population of 513 million individuals [37]. GDPR rights are also conferred on any individual who is a customer of a business with a registered office in an EU or UK country, meaning that these rights are in effect globally available to non-EU, non-UK users of many multinational digital service providers.↩︎